git.apps.os.sepia.ceph.com Git - ceph.git/log
ceph.git
3 years agodebian/control: fix python3-cherrypy*3* dependency 45711/head
Koen Kooi [Wed, 23 Feb 2022 16:40:48 +0000 (08:40 -0800)]
debian/control: fix python3-cherrypy*3* dependency

The trailing '3' was missed in one instance, ceph-mgr-cephadm, leading to:

Some packages could not be installed. This may mean that you have
requested an impossible situation or if you are using the unstable
distribution that some required packages have not yet been created
or been moved out of Incoming.
The following information may help to resolve the situation:

The following packages have unmet dependencies:
 ceph-mgr-cephadm : Depends: python3-cherrypy but it is not installable

Which makes the installation fail.

Fixes: 78983ad0d0cce422da32dc4876ac186f6d32c3f5
Signed-off-by: Koen Kooi <koen@softiron.com>
(cherry picked from commit b7b381fe91c0711249a7185b31f3dd60064f3b5a)

3 years agoMerge pull request #45695 from amathuria/amathuri-53923-fix-quincy
Yuri Weinstein [Wed, 30 Mar 2022 14:46:31 +0000 (07:46 -0700)]
Merge pull request #45695 from amathuria/amathuri-53923-fix-quincy

quincy: osd/osd_types: Increasing decode version of scrub_duration in pg stats

Reviewed-by: Neha Ojha <nojha@redhat.com>
3 years agoMerge pull request #45673 from dsavineau/cephadm_container_image_stable
Yuri Weinstein [Wed, 30 Mar 2022 14:44:47 +0000 (07:44 -0700)]
Merge pull request #45673 from dsavineau/cephadm_container_image_stable

cephadm: set quincy as stable release

Reviewed-by: Guillaume Abrioux <gabrioux@redhat.com>
Reviewed-by: Adam King <adking@redhat.com>
3 years agoMerge pull request #45604 from cbodley/wip-quincy-arrow-submodule
Yuri Weinstein [Wed, 30 Mar 2022 14:43:30 +0000 (07:43 -0700)]
Merge pull request #45604 from cbodley/wip-quincy-arrow-submodule

quincy: cmake: add submodule for Apache Arrow at v6.0.1

Reviewed-by: galsalomon66 <gal.salomon@gmail.com>
Reviewed-by: Yuri Weinstein <yweinste@redhat.com>
Reviewed-by: Ilya Dryomov <idryomov@redhat.com>
3 years agoosd/osd_types: Increasing decode version of scrub_duration in pg stats 45695/head
Aishwarya Mathuria [Tue, 29 Mar 2022 18:05:45 +0000 (23:35 +0530)]
osd/osd_types: Increasing decode version of scrub_duration in pg stats

All new fields added to pg stats after quincy RC need to have their decode version bumped to avoid decoding errors during an upgrade from quincy RC to the quincy stable version.

Fixes: https://tracker.ceph.com/issues/53923
Signed-off-by: Aishwarya Mathuria <amathuri@redhat.com>
(cherry picked from commit 3532b78901cc43ceb375da34a681e5a0f8eb53ac)
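
The version gating described above can be sketched in Python. This is only an illustration, not Ceph's encoding code: the byte layout, the `PGStats` class, and the version numbers 1 and 2 are assumptions made for the example.

```python
import struct
from dataclasses import dataclass

# Hypothetical layout: 1-byte struct version, 8-byte object count,
# and (from the assumed version 2 on) an 8-byte scrub duration.
@dataclass
class PGStats:
    num_objects: int
    scrub_duration: int = 0  # field added in the assumed version 2

def encode(stats: PGStats, version: int) -> bytes:
    out = struct.pack("<BQ", version, stats.num_objects)
    if version >= 2:
        out += struct.pack("<Q", stats.scrub_duration)
    return out

def decode(buf: bytes) -> PGStats:
    version, num_objects = struct.unpack_from("<BQ", buf, 0)
    stats = PGStats(num_objects)
    # Read the new field only when the encoder's version says it exists.
    if version >= 2:
        (stats.scrub_duration,) = struct.unpack_from("<Q", buf, 9)
    return stats
```

If the gate used the wrong version, a decoder would either skip a field that is present or read past the end of an older encoding, which is the class of upgrade error the commit guards against.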

3 years agocephadm: set quincy as stable release 45673/head
Dimitri Savineau [Mon, 28 Mar 2022 14:50:51 +0000 (10:50 -0400)]
cephadm: set quincy as stable release

Quincy isn't master anymore so we don't need the DEFAULT_IMAGE_IS_MASTER
variable set to true (which produces a warning message).
This also sets the LATEST_STABLE_RELEASE variable to quincy to match the
DEFAULT_IMAGE_RELEASE variable.

Signed-off-by: Dimitri Savineau <dsavinea@redhat.com>
3 years agoMerge pull request #45641 from ronen-fr/wip-rf-45640-quincy
Neha Ojha [Sun, 27 Mar 2022 17:44:00 +0000 (10:44 -0700)]
Merge pull request #45641 from ronen-fr/wip-rf-45640-quincy

Quincy: osd/scrub: restart snap trimming only after scrubbing is done

Reviewed-by: Josh Durgin <jdurgin@redhat.com>
Reviewed-by: Neha Ojha <nojha@redhat.com>
3 years agoMerge pull request #45629 from neha-ojha/wip-quincy-stable
Neha Ojha [Sat, 26 Mar 2022 02:28:57 +0000 (19:28 -0700)]
Merge pull request #45629 from neha-ojha/wip-quincy-stable

quincy: src/ceph_release: mark quincy stable

Reviewed-by: Josh Durgin <jdurgin@redhat.com>
3 years agoMerge pull request #45653 from ljflores/wip-quincy-fast-shutdown-backports
Neha Ojha [Sat, 26 Mar 2022 02:28:30 +0000 (19:28 -0700)]
Merge pull request #45653 from ljflores/wip-quincy-fast-shutdown-backports

Quincy: fast shutdown backports

Reviewed-by: Neha Ojha <nojha@redhat.com>
Reviewed-by: Sridhar Seshasayee <sseshasa@redhat.com>
3 years agoMerge pull request #45196 from adk3798/quincy-release-default-image
Josh Durgin [Sat, 26 Mar 2022 00:54:20 +0000 (17:54 -0700)]
Merge pull request #45196 from adk3798/quincy-release-default-image

quincy: cephadm: change default image to ceph/ceph:v17

Reviewed-by: Josh Durgin <jdurgin@redhat.com>
3 years agoMerge pull request #45616 from NitzanMordhai/wip-55021-quincy
Neha Ojha [Fri, 25 Mar 2022 20:59:19 +0000 (13:59 -0700)]
Merge pull request #45616 from NitzanMordhai/wip-55021-quincy

quincy: tests: ceph_test_rados_api_watch_notify: watch2Delete reconnect

Reviewed-by: Neha Ojha <nojha@redhat.com>
3 years agoMerge pull request #45615 from benhanokh/wip-55032-quincy
Neha Ojha [Fri, 25 Mar 2022 20:58:04 +0000 (13:58 -0700)]
Merge pull request #45615 from benhanokh/wip-55032-quincy

quincy: os/bluestore: Disable NCB functionality on rotational drives

Reviewed-by: Adam Kupczyk <akupczyk@redhat.com>
3 years agoMerge pull request #45652 from sseshasa/wip-55069-quincy
Neha Ojha [Fri, 25 Mar 2022 19:35:06 +0000 (12:35 -0700)]
Merge pull request #45652 from sseshasa/wip-55069-quincy

quincy: Doc: Improve mclock config reference documentation & update PendingReleaseNotes.

Reviewed-by: Neha Ojha <nojha@redhat.com>
3 years agoMerge pull request #45637 from idryomov/wip-diff-iterate-striping-fix-quincy
Ilya Dryomov [Fri, 25 Mar 2022 19:15:53 +0000 (20:15 +0100)]
Merge pull request #45637 from idryomov/wip-diff-iterate-striping-fix-quincy

quincy: librbd: make diff-iterate in fast-diff mode sort and merge reported extents

Reviewed-by: Christopher Hoffman <choffman@redhat.com>
3 years agoMerge pull request #45499 from cfsnyder/wip-54146-quincy
David Galloway [Fri, 25 Mar 2022 19:10:22 +0000 (15:10 -0400)]
Merge pull request #45499 from cfsnyder/wip-54146-quincy

quincy: rgw/admin: fix radosgw-admin datalog list max-entries issue

3 years agoMerge pull request #45504 from cfsnyder/wip-54154-quincy
David Galloway [Fri, 25 Mar 2022 18:44:29 +0000 (14:44 -0400)]
Merge pull request #45504 from cfsnyder/wip-54154-quincy

quincy: rgw: in bucket reshard list, clarify new num shards is tentative

3 years agoMerge pull request #45501 from cfsnyder/wip-54150-quincy
David Galloway [Fri, 25 Mar 2022 18:44:10 +0000 (14:44 -0400)]
Merge pull request #45501 from cfsnyder/wip-54150-quincy

quincy: rgw: RGWPostObj::execute() may lost data.

3 years agoMerge pull request #45498 from cfsnyder/wip-54093-quincy
David Galloway [Fri, 25 Mar 2022 18:41:40 +0000 (14:41 -0400)]
Merge pull request #45498 from cfsnyder/wip-54093-quincy

quincy: rgwlc:  warn on missing RGW_ATTR_LC

3 years agoMerge pull request #45490 from cfsnyder/wip-54076-quincy
David Galloway [Fri, 25 Mar 2022 18:41:13 +0000 (14:41 -0400)]
Merge pull request #45490 from cfsnyder/wip-54076-quincy

quincy: rgw: bucket chown bad memory usage

3 years agoqa/standalone: Fix test_activate_osd() test in ceph-helpers.sh 45653/head
Sridhar Seshasayee [Fri, 25 Mar 2022 16:40:31 +0000 (22:10 +0530)]
qa/standalone: Fix test_activate_osd() test in ceph-helpers.sh

Modify test_activate_osd() to get the type of scheduler in use and then
verify the value of osd_max_backfills. This is because mclock scheduler
overrides this option to 1000 upon OSD initialization.

The test earlier used to pass because the OSD daemon was killed but not
marked down and upon being brought up, the wait for OSD up check was
passing quickly. But the OSD still didn't have the latest config values.

But now upon killing the OSD, the osd_fast_shutdown sequence notifies the
mon (see PR: https://github.com/ceph/ceph/pull/44807) and is marked down
and dead. Upon bringing it up, the wait for OSD up check takes a longer
time and this is sufficient for the config values to be updated. This
results in the correct values being read from the config 'Values' map.

Signed-off-by: Sridhar Seshasayee <sseshasa@redhat.com>
(cherry picked from commit 3aa2df2e0f6f5bafadc96fd72935e5cf8b2fcf17)
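
The scheduler-dependent check can be sketched in Python. This is not the shell code in ceph-helpers.sh: the function name is hypothetical, and treating every scheduler whose name starts with "mclock" as applying the override is an assumption.

```python
MCLOCK_MAX_BACKFILLS = 1000  # override value stated in the commit message

def expected_max_backfills(scheduler: str, configured: int) -> int:
    """Return the osd_max_backfills value a test should expect."""
    # Assumption: any scheduler named "mclock*" applies the override.
    if scheduler.startswith("mclock"):
        return MCLOCK_MAX_BACKFILLS
    return configured
```

For example, `expected_max_backfills("wpq", 20)` yields the configured value, while `expected_max_backfills("mclock_scheduler", 20)` yields the override.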

3 years agoosd/OSD: osd_fast_shutdown_notify_mon not quite right
Nitzan Mordechai [Thu, 27 Jan 2022 13:13:28 +0000 (15:13 +0200)]
osd/OSD: osd_fast_shutdown_notify_mon not quite right

When osd_fast_shutdown and osd_fast_shutdown_notify_mon are both set to true, the OSD is
marked as Down, but it should be marked as Dead.

Fixes: https://tracker.ceph.com/issues/53327

Signed-off-by: Nitzan Mordechai <nmordech@redhat.com>
(cherry picked from commit 07302d5e41c49c885c9398c1c478638023e3f264)

3 years agoosd: make osd_fast_shutdown_notify_mon option true by default
Satoru Takeuchi [Thu, 18 Nov 2021 20:48:18 +0000 (20:48 +0000)]
osd: make osd_fast_shutdown_notify_mon option true by default

osd_fast_shutdown_notify_mon option is false by default. So users suffer
from error log flood, slow ops, and the long I/O timeouts on voluntary OS
shutdown before they are aware of the existence of this option. Let's
make this option true by default.

Fixes: https://tracker.ceph.com/issues/53328
Signed-off-by: Satoru Takeuchi <satoru.takeuchi@gmail.com>
(cherry picked from commit 729a5b85a6586b47d16acbba2cf8e765e498cd65)

3 years agoPendingReleaseNotes: Add mclock config reference link to an existing note 45652/head
Sridhar Seshasayee [Fri, 18 Mar 2022 14:55:25 +0000 (20:25 +0530)]
PendingReleaseNotes: Add mclock config reference link to an existing note

Signed-off-by: Sridhar Seshasayee <sseshasa@redhat.com>
(cherry picked from commit 0511a8eadcc3824362fb8620a09b2796c514fd92)

3 years agodoc: Improvements to mClock configuration reference documentation
Sridhar Seshasayee [Fri, 18 Mar 2022 07:43:52 +0000 (13:13 +0530)]
doc: Improvements to mClock configuration reference documentation

Improve the documentation around:
 - mclock client types.
 - mclock config profiles, described in greater detail.
 - Manually benchmarking OSDs and tuning bluestore throttle
   parameters.
 - A couple of previously missing mclock configuration options.

Signed-off-by: Sridhar Seshasayee <sseshasa@redhat.com>
(cherry picked from commit afe3a7543c65a521ef0272a292d0e521ec3674c9)

3 years agoMerge pull request #45493 from cfsnyder/wip-54078-quincy
David Galloway [Fri, 25 Mar 2022 16:45:41 +0000 (12:45 -0400)]
Merge pull request #45493 from cfsnyder/wip-54078-quincy

quincy: rgw: Match decode_json with dump for default-placement in RGWZoneGroup.

3 years agoMerge pull request #45576 from idryomov/wip-fix-pids-limit-quincy
Ilya Dryomov [Fri, 25 Mar 2022 16:39:18 +0000 (17:39 +0100)]
Merge pull request #45576 from idryomov/wip-fix-pids-limit-quincy

quincy: cephadm: Remove containers pids-limit

Reviewed-by: Adam King <adking@redhat.com>
3 years agoMerge pull request #45494 from cfsnyder/wip-54084-quincy
Yuri Weinstein [Fri, 25 Mar 2022 15:03:32 +0000 (08:03 -0700)]
Merge pull request #45494 from cfsnyder/wip-54084-quincy

quincy: librgw: make rgw file handle versioned

Reviewed-by: Casey Bodley <cbodley@redhat.com>
3 years agoMerge pull request #45422 from cfsnyder/wip-54428-quincy
Yuri Weinstein [Fri, 25 Mar 2022 15:01:24 +0000 (08:01 -0700)]
Merge pull request #45422 from cfsnyder/wip-54428-quincy

quincy: rgw: add OPT_BUCKET_SYNC_RUN to gc_ops_list, so that

Reviewed-by: Casey Bodley <cbodley@redhat.com>
3 years agoosd/scrub: restart snap trimming only after scrubbing is done 45641/head
Ronen Friedman [Fri, 25 Mar 2022 10:45:47 +0000 (10:45 +0000)]
osd/scrub: restart snap trimming only after scrubbing is done

Snap trimming that was postponed as the target PG was scrubbing
must be restarted at scrub completion.
PR #38111 moved trimming restart to just before the scrub fully
terminated. The current PR fixes that.

Trimming is also restarted in those cases where scrub was
queued but aborted immediately.

Fixes: https://tracker.ceph.com/issues/52026
Signed-off-by: Ronen Friedman <rfriedma@redhat.com>
(cherry picked from commit 948d3266c67bf896d1c20472977b849178d233d3)

3 years agolibrbd: make diff-iterate in fast-diff mode sort and merge reported extents 45637/head
Ilya Dryomov [Sun, 20 Mar 2022 11:10:52 +0000 (12:10 +0100)]
librbd: make diff-iterate in fast-diff mode sort and merge reported extents

Various users, the most notable example being the QEMU driver, assume
that extents are reported in image offset order.

Fixes: https://tracker.ceph.com/issues/53885
Signed-off-by: Ilya Dryomov <idryomov@gmail.com>
(cherry picked from commit 85e7075d5f021bd2d11024e6646d74a8a9f96e15)
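
A minimal Python sketch of sorting and merging extents into image offset order. It is not the librbd implementation: the `(offset, length, exists)` tuple shape is an assumption, and the input extents are assumed not to overlap.

```python
from typing import List, Tuple

Extent = Tuple[int, int, bool]  # (offset, length, exists), hypothetical shape

def sort_and_merge(extents: List[Extent]) -> List[Extent]:
    merged: List[Extent] = []
    for off, length, exists in sorted(extents):
        if merged:
            p_off, p_len, p_exists = merged[-1]
            # Merge only extents that touch and share the same state.
            if p_exists == exists and p_off + p_len == off:
                merged[-1] = (p_off, p_len + length, exists)
                continue
        merged.append((off, length, exists))
    return merged
```

A caller that walks the result can then rely on strictly increasing offsets, which is the assumption the commit message attributes to users such as the QEMU driver.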

3 years agosrc/ceph_release: mark quincy stable 45629/head
Neha Ojha [Fri, 25 Mar 2022 00:37:33 +0000 (00:37 +0000)]
src/ceph_release: mark quincy stable

we missed marking it dev, but it is now time for the final release
so mark it stable

Signed-off-by: Neha Ojha <nojha@redhat.com>
3 years agoMerge pull request #45600 from aclamk/wip-55024-quincy
Yuri Weinstein [Thu, 24 Mar 2022 20:19:22 +0000 (13:19 -0700)]
Merge pull request #45600 from aclamk/wip-55024-quincy

quincy: os/bluestore/bluefs: Improve unittest for compaction

Reviewed-by: Neha Ojha <nojha@redhat.com>
3 years agoMerge pull request #45592 from vumrao/wip-vumrao-55018
Yuri Weinstein [Thu, 24 Mar 2022 20:19:03 +0000 (13:19 -0700)]
Merge pull request #45592 from vumrao/wip-vumrao-55018

quincy: osd/PrimaryLogPG.cc: CEPH_OSD_OP_OMAPRMKEYRANGE should mark omap dirty

Reviewed-by: Neha Ojha <nojha@redhat.com>
3 years agoMerge pull request #45590 from ronen-fr/wip-rf-44744-quincy
Yuri Weinstein [Thu, 24 Mar 2022 20:18:14 +0000 (13:18 -0700)]
Merge pull request #45590 from ronen-fr/wip-rf-44744-quincy

quincy: scrub/osd: add a missing 'publish stats to osd'

Reviewed-by: Neha Ojha <nojha@redhat.com>
3 years agoMerge pull request #45585 from idryomov/wip-pool-reverse-lookup-osdmap-quincy
Yuri Weinstein [Thu, 24 Mar 2022 20:17:54 +0000 (13:17 -0700)]
Merge pull request #45585 from idryomov/wip-pool-reverse-lookup-osdmap-quincy

quincy: librados: check latest osdmap on ENOENT in pool_reverse_lookup()

Reviewed-by: Neha Ojha <nojha@redhat.com>
3 years agoMerge pull request #45524 from sseshasa/wip-54612-quincy
Yuri Weinstein [Thu, 24 Mar 2022 20:17:20 +0000 (13:17 -0700)]
Merge pull request #45524 from sseshasa/wip-54612-quincy

quincy: mon, osd: Add snaptrim stats to the existing PG stats.

Reviewed-by: Neha Ojha <nojha@redhat.com>
3 years agoquincy: cephadm: change default image to ceph/ceph:v17 45196/head
Adam King [Mon, 28 Feb 2022 13:06:44 +0000 (08:06 -0500)]
quincy: cephadm: change default image to ceph/ceph:v17

Should be merged right before the final release is cut (but not before)

Signed-off-by: Adam King <adking@redhat.com>
3 years agotests: ceph_test_rados_api_watch_notify: watch2Delete reconnect 45616/head
NitzanMordhai [Sun, 13 Mar 2022 08:52:59 +0000 (08:52 +0000)]
tests: ceph_test_rados_api_watch_notify: watch2Delete reconnect

During the LibRadosWatchNotify.Watch2Delete test, rados_watch_check can return error -102 if a reconnect happened: a broken pipe triggers the reconnect, and -102 is returned.

Fixes: https://tracker.ceph.com/issues/51307
Signed-off-by: Nitzan Mordechai <nmordec@redhat.com>
Signed-off-by: NitzanMordhai <nmordech@redhat.com>
(cherry picked from commit 8c8414a953f198113cec038f83e78e52127f3cc4)

3 years agoFix a problem in store_test::BluestoreBrokenNoSharedBlobRepairTest where the check... 45615/head
Gabriel BenHanokh [Mon, 21 Mar 2022 10:54:10 +0000 (12:54 +0200)]
Fix a problem in store_test::BluestoreBrokenNoSharedBlobRepairTest where the check for active null-fm was wrong and so reported bogus errors when null-fm was inactive
The check needs to access the dynamic value and not the config setting (which can be overridden)
Signed-off-by: Gabriel BenHanokh <gbenhano@redhat.com>
(cherry picked from commit 2969539d20a8157d62ae27f842c43b801efdc0ee)

3 years agoBug-Fix from PR-44370 force setting need_to_destage_allocation_file to True on device...
Gabriel BenHanokh [Thu, 17 Mar 2022 20:26:58 +0000 (22:26 +0200)]
Bug-Fix from PR-44370 force setting need_to_destage_allocation_file to True on device expansion without checking if we work in null-fm mode
Signed-off-by: Gabriel Benhanokh <gbenhano@redhat.com>
(cherry picked from commit f7ebef8a804b8ce193bcbee4284dc28102708f37)

3 years agoos/bluestore: Disable NCB functionality on rotational drives
Gabriel BenHanokh [Thu, 10 Mar 2022 15:40:31 +0000 (17:40 +0200)]
os/bluestore: Disable NCB functionality on rotational drives
NCB code needs to recover allocation map after an OSD crash.
The recovery process on rotational drives is about 20x slower than on SSDs, making this solution unacceptable for that environment.

Signed-off-by: Gabriel Benhanokh <gbenhano@redhat.com>
(cherry picked from commit 5fd09658edbf636dd462facfa9878656f641e7de)

3 years agodeb: add build profile for system arrow 45604/head
Casey [Fri, 4 Feb 2022 21:34:30 +0000 (13:34 -0800)]
deb: add build profile for system arrow

Signed-off-by: Casey <cbodley@redhat.com>
(cherry picked from commit e8460cbd5af1f2f88c52d8e955805bd09d9c3701)

3 years agoceph.spec.in: add system_arrow and system_utf8proc conditions
Casey [Fri, 4 Feb 2022 21:15:19 +0000 (13:15 -0800)]
ceph.spec.in: add system_arrow and system_utf8proc conditions

Signed-off-by: Casey <cbodley@redhat.com>
(cherry picked from commit 223c5e8dc03500017acf903077ef322c62c05f7b)

3 years agocmake: move Arrow targets into find_package modules
Casey [Fri, 4 Feb 2022 21:02:52 +0000 (13:02 -0800)]
cmake: move Arrow targets into find_package modules

Signed-off-by: Casey <cbodley@redhat.com>
(cherry picked from commit 5da406a4ee57b740e03506872465749d8201f50d)

3 years agocmake: use arrow's find_package modules
Casey [Fri, 4 Feb 2022 20:53:08 +0000 (12:53 -0800)]
cmake: use arrow's find_package modules

Signed-off-by: Casey <cbodley@redhat.com>
(cherry picked from commit 433782dbd5668a011bf90181e98547130abe54ef)

3 years agocmake: add WITH_SYSTEM_ARROW to skip submodule build
Casey [Fri, 4 Feb 2022 20:31:58 +0000 (12:31 -0800)]
cmake: add WITH_SYSTEM_ARROW to skip submodule build

relies on a hack to find the installed ParquetConfig.cmake

Signed-off-by: Casey <cbodley@redhat.com>
(cherry picked from commit ed60aeed0b28f0138408c3f34ede9c7898c01e54)

3 years agoceph.spec.in: seastar drops _FORTIFY_SOURCE from CFLAGS also
Casey Bodley [Fri, 4 Feb 2022 14:51:24 +0000 (09:51 -0500)]
ceph.spec.in: seastar drops _FORTIFY_SOURCE from CFLAGS also

the arrow submodule builds some C sources that trip up on _FORTIFY_SOURCE in debug builds

[ 79%] Building C object src/arrow/CMakeFiles/arrow_objlib.dir/vendored/musl/strptime.c.o
In file included from /usr/include/time.h:25,
                 from /home/jenkins-build/build/workspace/ceph-dev-new-build/ARCH/x86_64/AVAILABLE_ARCH/x86_64/AVAILABLE_DIST/centos8/DIST/centos8/MACHINE_SIZE/gigantic/release/17.0.0-10531-gc73e1fda/rpm/el8/BUILD/ceph-17.0.0-10531-gc73e1fda/src/arrow/cpp/src/arrow/vendored/strptime.h:20,
                 from /home/jenkins-build/build/workspace/ceph-dev-new-build/ARCH/x86_64/AVAILABLE_ARCH/x86_64/AVAILABLE_DIST/centos8/DIST/centos8/MACHINE_SIZE/gigantic/release/17.0.0-10531-gc73e1fda/rpm/el8/BUILD/ceph-17.0.0-10531-gc73e1fda/src/arrow/cpp/src/arrow/vendored/musl/strptime.c:4:
/usr/include/features.h:381:4: error: #warning _FORTIFY_SOURCE requires compiling with optimization (-O) [-Werror=cpp]
  381 | #  warning _FORTIFY_SOURCE requires compiling with optimization (-O)
      |    ^~~~~~~
cc1: all warnings being treated as errors
make[5]: *** [src/arrow/CMakeFiles/arrow_objlib.dir/build.make:2543: src/arrow/CMakeFiles/arrow_objlib.dir/vendored/musl/strptime.c.o] Error 1

Signed-off-by: Casey Bodley <cbodley@redhat.com>
(cherry picked from commit 2d80f0cd258df65cdba6f1bb4115f7797e9e5677)

3 years agocmake: add submodule for utf8proc at v2.2.0
Casey Bodley [Fri, 28 Jan 2022 18:44:56 +0000 (13:44 -0500)]
cmake: add submodule for utf8proc at v2.2.0

adds utf8proc submodule, needed by the arrow submodule in centos. add a
WITH_SYSTEM_UTF8PROC option that controls whether or not utf8proc is
built from submodule

non-system utf8proc is built as a static library to avoid conflicts with
system-provided libraries

ceph.spec.in sets WITH_SYSTEM_UTF8PROC=OFF until it's available in
centos

Signed-off-by: Casey Bodley <cbodley@redhat.com>
(cherry picked from commit b10364dc21d964465dc0192b1c600bb8c6963213)

3 years agocmake: add submodule for Apache Arrow at v6.0.1
Casey Bodley [Thu, 20 Jan 2022 15:22:27 +0000 (10:22 -0500)]
cmake: add submodule for Apache Arrow at v6.0.1

adds an arrow submodule. when WITH_RADOSGW_SELECT_PARQUET is enabled,
the submodule is built as an external project and rgw links against its
imported Arrow::Parquet target

Signed-off-by: Casey Bodley <cbodley@redhat.com>
(cherry picked from commit 2ca6d75521541e99ebb6101f6d350f92a6797a8b)

Conflicts:
CMakeLists.txt master has an extra option WITH_RADOSGW_MOTR

3 years agoMerge pull request #45594 from neha-ojha/wip-45512-quincy
Ilya Dryomov [Wed, 23 Mar 2022 18:55:42 +0000 (19:55 +0100)]
Merge pull request #45594 from neha-ojha/wip-45512-quincy

quincy: ceph/admin: s/master/main

Reviewed-by: Josh Durgin <jdurgin@redhat.com>
Reviewed-by: Radoslaw Zarzynski <rzarzyns@redhat.com>
3 years agoMerge pull request #45367 from Matan-B/wip-54508-quincy
Neha Ojha [Wed, 23 Mar 2022 18:14:20 +0000 (11:14 -0700)]
Merge pull request #45367 from Matan-B/wip-54508-quincy

quincy: Revert "doc/dev: Running workunits locally"

Reviewed-by: Radoslaw Zarzynski <rzarzyns@redhat.com>
3 years agoos/bluestore/bluefs: Improve unittest for compaction 45600/head
Adam Kupczyk [Thu, 3 Mar 2022 14:36:58 +0000 (15:36 +0100)]
os/bluestore/bluefs: Improve unittest for compaction

Improved unittest for compaction to add some files after compacting.
It is used to prove that there is a problem with sync compaction.

Signed-off-by: Adam Kupczyk <akupczyk@redhat.com>
(cherry picked from commit 40160365f344ebfb43173a5366015ac4cdb7a3fe)

3 years agoceph/admin: s/master/main 45594/head
Zac Dover [Thu, 17 Mar 2022 23:05:45 +0000 (09:05 +1000)]
ceph/admin: s/master/main

This PR changes the name "master" to "main" so
that builds (and, I assume, a great many other
things) will not fail.

Signed-off-by: Zac Dover <zac.dover@gmail.com>
(cherry picked from commit 6a1dd3a8a2f3dc9fe8615d402c9041273516ff89)

3 years agoosd/PrimaryLogPG.cc: CEPH_OSD_OP_OMAPRMKEYRANGE should mark omap dirty 45592/head
Neha Ojha [Wed, 16 Mar 2022 18:37:19 +0000 (18:37 +0000)]
osd/PrimaryLogPG.cc: CEPH_OSD_OP_OMAPRMKEYRANGE should mark omap dirty

We should mark_omap_dirty() for all omap write ops, just like we did
in cb927925af1f3df4b9c31df85cf31f982aae1988.

Currently, for CEPH_OSD_OP_OMAPRMKEYRANGE ops, clean_omap gets set to true,
which results in incomplete recovery of objects and results in
inconsistent PGs after a scrub.

Fixes: https://tracker.ceph.com/issues/54592
Signed-off-by: Neha Ojha <nojha@redhat.com>
(cherry picked from commit f7fd5895fd3d7d7c4691be91434868d90f7a4e0f)

3 years agoscrub/osd: add a missing 'publish stats to osd' 45590/head
Ronen Friedman [Sun, 23 Jan 2022 06:54:58 +0000 (08:54 +0200)]
scrub/osd: add a missing 'publish stats to osd'

to publish the last scrub status report.
The change is needed following the merge of
PR #42735.

Signed-off-by: Ronen Friedman <rfriedma@redhat.com>
(cherry picked from commit ab032e9ac577b32c47528ae32c91b652079288c3)

3 years agolibrados: check latest osdmap on ENOENT in pool_reverse_lookup() 45585/head
Ilya Dryomov [Wed, 16 Mar 2022 19:05:56 +0000 (20:05 +0100)]
librados: check latest osdmap on ENOENT in pool_reverse_lookup()

Avoid spurious ENOENT errors from rados_pool_reverse_lookup() and
Rados::pool_reverse_lookup().

This makes lookup by id consistent with lookup by name: the latter
has been checking latest osdmap since commit 7e5669b11b14 ("rados: we
need to get the latest osdmap when pool does not exists").

Fixes: https://tracker.ceph.com/issues/54593
Signed-off-by: Ilya Dryomov <idryomov@gmail.com>
(cherry picked from commit 1f837e233af32c8a66f88508cde534c361ecfcbc)
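
The refresh-then-retry behaviour can be sketched in Python. This is not librados code: the `PoolLookup` class and its `fetch_latest` callback are hypothetical stand-ins for a client that holds a cached osdmap.

```python
import errno
from typing import Callable, Dict

class PoolLookup:
    def __init__(self, cached: Dict[int, str],
                 fetch_latest: Callable[[], Dict[int, str]]):
        self.pools = cached
        self.fetch_latest = fetch_latest

    def reverse_lookup(self, pool_id: int) -> str:
        name = self.pools.get(pool_id)
        if name is None:
            # The cached map may be stale: refresh once before reporting ENOENT.
            self.pools = self.fetch_latest()
            name = self.pools.get(pool_id)
        if name is None:
            raise OSError(errno.ENOENT, f"pool {pool_id} does not exist")
        return name
```

With this shape, a pool created after the map was cached is found on the retry, and ENOENT is reported only when the latest map also lacks the id.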

3 years agoMerge pull request #45273 from idryomov/wip-rbd-quincy-batch-5
Ilya Dryomov [Wed, 23 Mar 2022 11:54:41 +0000 (12:54 +0100)]
Merge pull request #45273 from idryomov/wip-rbd-quincy-batch-5

quincy: rbd backports (batch 5)

Reviewed-by: Deepika Upadhyay <dupadhya@redhat.com>
3 years agoqa/suites/orch/cephadm: restrict test_iscsi_pids_limit to CentOS 45576/head
Ilya Dryomov [Tue, 22 Mar 2022 10:36:18 +0000 (11:36 +0100)]
qa/suites/orch/cephadm: restrict test_iscsi_pids_limit to CentOS

Signed-off-by: Ilya Dryomov <idryomov@gmail.com>
(cherry picked from commit f0ade57458b93f8401de8670ae62bf2295a6c40c)

[ commit 1f714da81440 ("qa: fix or add missing .qa links") not in
  quincy -- added qa/suites/orch/cephadm/workunits/task/.qa ]

3 years agocephadm: remove containers pids-limit
Teoman ONAY [Thu, 11 Nov 2021 15:05:49 +0000 (15:05 +0000)]
cephadm: remove containers pids-limit

The default pids-limit (docker 4096/podman 2048) prevents some
customizations from working (HTTP threads on RGW) or limits the number
of LUNs per iSCSI target.

Fixes: https://tracker.ceph.com/issues/52898
Signed-off-by: Teoman ONAY <tonay@redhat.com>
(cherry picked from commit de8b3c2676e65eb61df54c65cfd3b3af1e68da56)

3 years agoMerge pull request #45383 from idryomov/windows-build-fix-quincy
Ilya Dryomov [Tue, 22 Mar 2022 20:33:30 +0000 (21:33 +0100)]
Merge pull request #45383 from idryomov/windows-build-fix-quincy

quincy: include: Define dlfcn.h on Windows

Reviewed-by: Kamoltat Sirivadhna <ksirivad@redhat.com>
3 years agolibrbd: readv/writev fix iovecs length computation overflow 45273/head
Jonas Pfefferle [Wed, 9 Mar 2022 13:26:42 +0000 (14:26 +0100)]
librbd: readv/writev fix iovecs length computation overflow

iovecs have an unsigned length (size_t) and before this patch the
total length was computed by adding iovec's length to a signed
length variable (ssize_t). While the code checked if the resulting
length was negative on overflow, the case where length is positive
after overflow was not checked. This patch fixes the overflow check
by changing length to unsigned size_t.

Additionally, this patch fixes the case where some iovecs have been
added to the bufferlist and the aio completion has been blocked, but
adding an additional iovec fails because of overflow. This leads to
the UserBufferDeleter trying to unblock the completion on destruction
of the bufferlist but asserting because the completion was never
armed. We avoid this by first computing the total length and checking
for overflows and iovcnt before adding them to the bufferlist.

Signed-off-by: Jonas Pfefferle <pepperjo@japf.ch>
(cherry picked from commit e50405ef857f487bc1c104bbf3e8859ea099a0c4)
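
The validate-first approach can be sketched in Python, emulating unsigned arithmetic with an explicit bound. This is not the librbd code: the 64-bit `SIZE_MAX`, the default `iov_max` of 1024, and the function name are assumptions.

```python
from typing import List

SIZE_MAX = 2**64 - 1  # assumed 64-bit size_t

def total_iov_length(lengths: List[int], iov_max: int = 1024) -> int:
    """Validate and sum iovec lengths before touching any buffer."""
    if not 0 < len(lengths) <= iov_max:
        raise ValueError("invalid iovcnt")
    total = 0
    for n in lengths:
        # Unsigned overflow check: would total + n exceed SIZE_MAX?
        if n > SIZE_MAX - total:
            raise OverflowError("iovec lengths overflow size_t")
        total += n
    return total
```

Because the whole sum is checked before anything is appended to a buffer, a failure leaves no partially built state behind to clean up.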

3 years agotest/librbd: add test to verify diff_iterate size
Christopher Hoffman [Mon, 7 Mar 2022 18:35:56 +0000 (18:35 +0000)]
test/librbd: add test to verify diff_iterate size

Add test case to verify diff size values of image and multiple
snapshots.

Fixes: https://tracker.ceph.com/issues/54440
Signed-off-by: Christopher Hoffman <choffman@redhat.com>
(cherry picked from commit d4e44df1be2bafa1c0ceabc73bb7243104fc7ad4)

3 years agoqa/workunits/rbd/cli_generic.sh: relax trash purge schedule status assert
Ilya Dryomov [Sat, 19 Mar 2022 13:04:52 +0000 (14:04 +0100)]
qa/workunits/rbd/cli_generic.sh: relax trash purge schedule status assert

Commit 08df6e0fd006 ("qa/workunits/rbd: expand LevelSpec parsing
coverage") didn't account for images with a separate data pool.  This
was missed because of small-cache-pool.yaml breakage.

Signed-off-by: Ilya Dryomov <idryomov@gmail.com>
(cherry picked from commit 90a90ad47dd3140c796ef4da7263c9633d34e841)

3 years agoosd: Add snaptrim duration to pg dump stats. 45524/head
Sridhar Seshasayee [Mon, 14 Mar 2022 20:08:57 +0000 (01:38 +0530)]
osd: Add snaptrim duration to pg dump stats.

Add the snaptrim duration to the json formatted output of the pg dump
stats. Define methods for a PG to set the snaptrim begin time and then to
calculate the total time spent to trim all the objects for the snaps in
the snap_trimq for the PG.

Tests:
  - Librados C and C++ API tests to verify the time spent for a snaptrim
    operation on a PG. These tests use the self-managed snaps APIs.
  - Standalone tests to verify snaptrim duration using rados pool snaps.

Signed-off-by: Sridhar Seshasayee <sseshasa@redhat.com>
(cherry picked from commit a86ead953dc5fa2c78a4fe86700b0c1aba2727af)
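
The begin/end bookkeeping can be sketched in Python. This is not the OSD code: the `SnapTrimTimer` class, its method names, and the injectable clock are assumptions made for the example.

```python
import time

class SnapTrimTimer:
    def __init__(self, clock=time.monotonic):
        self._clock = clock   # injectable so the timer can be tested
        self._begin = None
        self.duration = 0.0   # seconds spent in the last completed trim

    def set_begin(self):
        self._begin = self._clock()

    def set_end(self):
        # Ignore an end without a matching begin.
        if self._begin is not None:
            self.duration = self._clock() - self._begin
            self._begin = None
```

A monotonic clock is used here so that wall-clock adjustments cannot produce a negative duration.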

3 years agomon, osd: Add objects trimmed to pg dump stats.
Sridhar Seshasayee [Thu, 17 Feb 2022 11:38:36 +0000 (17:08 +0530)]
mon, osd: Add objects trimmed to pg dump stats.

Add a new column, OBJECTS_TRIMMED, to the pg dump stats that shows the
number of objects trimmed when a snap is removed.

When a pg splits, the stats from the parent pg are copied to the child
pg. In such a case, reset objects_trimmed to 0 for the child pg
(see PeeringState::split_into()). Otherwise, this will result in incorrect
stats being shown for a child pg after the split operation.

Tests:
 - Librados C and C++ API tests to verify the number of objects trimmed
   during snaptrim operation. These tests use the self-managed snaps APIs.
 - Standalone tests to verify objects trimmed using rados pool snaps.

Signed-off-by: Sridhar Seshasayee <sseshasa@redhat.com>
(cherry picked from commit 00249dc0cc69d4c065acbb33543d10cb360930dc)
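
The reset on split can be sketched in Python. This is not PeeringState::split_into(): the `PGStat` class is hypothetical and the sketch copies every other field unchanged, which is a simplification.

```python
from dataclasses import dataclass, replace

@dataclass
class PGStat:
    num_objects: int = 0
    objects_trimmed: int = 0

def split_into(parent: PGStat) -> PGStat:
    # The child starts as a copy of the parent but has trimmed nothing itself.
    return replace(parent, objects_trimmed=0)
```

Without the reset, the child would report the parent's trim count as its own.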

3 years agoMerge pull request #45471 from amathuria/wip-54601-quincy
Yuri Weinstein [Fri, 18 Mar 2022 17:40:14 +0000 (10:40 -0700)]
Merge pull request #45471 from amathuria/wip-54601-quincy

quincy: osd/scrub: add scrub duration to pg stats

Reviewed-by: Neha Ojha <nojha@redhat.com>
Reviewed-by: Sridhar Seshasayee <sseshasa@redhat.com>
3 years agoMerge pull request #45396 from kamoltat/wip-ksirivad-quincy-backport-45078
Yuri Weinstein [Fri, 18 Mar 2022 17:39:19 +0000 (10:39 -0700)]
Merge pull request #45396 from kamoltat/wip-ksirivad-quincy-backport-45078

quincy: mon/MonCommands.h: fix target_size_ratio range

Reviewed-by: Neha Ojha <nojha@redhat.com>
3 years agoMerge pull request #45363 from kamoltat/wip-ksirivad-quincy-backport-45200
Yuri Weinstein [Fri, 18 Mar 2022 17:38:42 +0000 (10:38 -0700)]
Merge pull request #45363 from kamoltat/wip-ksirivad-quincy-backport-45200

quincy: osd/osd_types: pg_num_max reordering

Reviewed-by: Neha Ojha <nojha@redhat.com>
3 years agoMerge pull request #45315 from dang/wip-dang-fix-inverted
Yuri Weinstein [Fri, 18 Mar 2022 15:47:09 +0000 (08:47 -0700)]
Merge pull request #45315 from dang/wip-dang-fix-inverted

quincy: RGW - Fix inverted return check

Reviewed-by: Casey Bodley <cbodley@redhat.com>
3 years agoMerge pull request #45331 from nmshelke/wip-54477-quincy
Yuri Weinstein [Thu, 17 Mar 2022 21:50:27 +0000 (14:50 -0700)]
Merge pull request #45331 from nmshelke/wip-54477-quincy

quincy: ceph-fuse: perform cleanup if test_dentry_handling failed

Reviewed-by: Venky Shankar <vshankar@redhat.com>
Reviewed-by: Kotresh HR <khiremat@redhat.com>
3 years agorgw: in bucket reshard list, clarify new num shards is tentative 45504/head
J. Eric Ivancich [Wed, 22 Dec 2021 19:45:59 +0000 (14:45 -0500)]
rgw: in bucket reshard list, clarify new num shards is tentative

With dynamic bucket index resharding, when the average number of
objects per shard exceeds the configured value, that bucket is
scheduled for reshard. That bucket may receive more new objects before
the resharding takes place. As a result, the existing code
re-calculates the number of new shards just prior to resharding,
rather than waste a resharding opportunity with too low a value.

The same holds true for a user-scheduled resharding.

A user reported confusion that the number reported in `radosgw-admin
reshard list` wasn't the number that the reshard operation ultimately
used. This commit makes it clear that the new number of shards is
"tentative". And test_rgw_reshard.py is updated to reflect this
altered output.

Additionally this commit adds some modernization and efficiency to the
"reshard list" subcommand.

Signed-off-by: J. Eric Ivancich <ivancich@redhat.com>
(cherry picked from commit aa0071ce8b8594b92c0bed2be7a9bf35bfff8cac)
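
A small Python sketch of why the listed shard count is only tentative. The formula is a deliberate simplification and not RGW's actual calculation: the function name is hypothetical, and the object counts and the 100,000 objects-per-shard limit are illustrative values.

```python
import math

def suggested_shards(num_objects: int, max_objs_per_shard: int) -> int:
    # Simplified: enough shards to keep each one under the limit.
    return max(1, math.ceil(num_objects / max_objs_per_shard))

# Count computed when the bucket is scheduled for reshard.
tentative = suggested_shards(250_000, 100_000)
# Count recomputed just before resharding, after more objects arrived.
final = suggested_shards(450_000, 100_000)
```

Here the value shown at scheduling time differs from the one actually used, which is the situation the "tentative" label is meant to explain.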

3 years agorgw: RGWPostObj::execute() may lost data. 45501/head
Lei Zhang [Wed, 14 Jul 2021 09:30:48 +0000 (17:30 +0800)]
rgw: RGWPostObj::execute() may lost data.

Signed-off-by: Lei Zhang <1091517373@qq.com>
(cherry picked from commit f241a330dcb5968f9ec1de1a382572258cb6daac)

3 years agorgw/admin: fix radosgw-admin datalog list max-entries issue 45499/head
Yuval Lifshitz [Wed, 2 Feb 2022 14:53:21 +0000 (16:53 +0200)]
rgw/admin: fix radosgw-admin datalog list max-entries issue

Fixes: https://tracker.ceph.com/issues/54116
Signed-off-by: Yuval Lifshitz <ylifshit@redhat.com>
(cherry picked from commit bd429ed9bec8aa4dc17c61a07e30987f50f7e5f6)

3 years agorgwlc: warn on missing RGW_ATTR_LC 45498/head
Matt Benjamin [Fri, 24 Dec 2021 19:35:00 +0000 (14:35 -0500)]
rgwlc:  warn on missing RGW_ATTR_LC

This should not happen.  If it does (e.g., due to damaged bucket_info),
log the event to assist with debugging.

Fixes: https://tracker.ceph.com/issues/53728
Signed-off-by: Matt Benjamin <mbenjamin@redhat.com>
(cherry picked from commit ae1a75c09d11d8f0b626c781112c35de353c0c89)

3 years agolibrgw: move RGWFileHandle::encode/decode to the private sector 45494/head
Xuehan Xu [Sun, 25 Apr 2021 07:24:08 +0000 (15:24 +0800)]
librgw: move RGWFileHandle::encode/decode to the private sector

This prevents the RGWFileHandle::encode/decode methods from being invoked
directly by other modules.

Signed-off-by: Xuehan Xu <xxhdx1985126@gmail.com>
(cherry picked from commit 068c5e7ff1286ac4d5624f6e6bd7dedc21b34095)

3 years agolibrgw: make rgw file handle versioned
Xuehan Xu [Sat, 2 Jan 2021 14:50:23 +0000 (22:50 +0800)]
librgw: make rgw file handle versioned

The reason that we need this is that there could be the following scenario:

1. rgw_setattr sets the file attr;
2. rgw_write writes some new data, and encodes its attr to store into rados;
3. before the file's attr bl is actually persisted, rgw_lookup loads the file's
   previous attr and modifies the current file handle's metadata;
4. rgw_write's result is persisted to rados;
5. rgw_setattr writes the current file handle's metadata, which is actually stale, to rados.

In this case, the attr in rados would be out of date, which means loss of data.
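
The race above can be sketched with a version check. This is only an illustration of how versioning can reject a stale write; `AttrStore`, its fields and the `based_on_version` parameter are hypothetical and are not taken from librgw.

```python
class AttrStore:
    """Hypothetical stand-in for the attrs persisted in rados."""

    def __init__(self):
        self.version = 0
        self.attrs = {}

    def write(self, attrs, based_on_version):
        # Refuse the write if it was prepared from an older snapshot.
        if based_on_version != self.version:
            return False
        self.attrs = dict(attrs)
        self.version += 1
        return True


store = AttrStore()

# A file handle loads the attrs while the store is still at version 0.
stale_version, stale_attrs = store.version, dict(store.attrs)

# A concurrent write persists new data and bumps the version.
write_ok = store.write({"size": 4096}, based_on_version=0)

# A later setattr built on the stale snapshot is rejected instead of
# silently overwriting the newer attrs.
stale_attrs["mode"] = 0o644
setattr_ok = store.write(stale_attrs, based_on_version=stale_version)
```

Without the version comparison, the last write would win and the `size` recorded by the concurrent write would be lost.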

Fixes: https://tracker.ceph.com/issues/50194
Signed-off-by: Xuehan Xu <xuxuehan@qianxin.com>
(cherry picked from commit 49a35d72e0982c03781d4845c800332bded1c658)

3 years agorgw: Match decode_json with dump for default-placement in RGWZoneGroup. 45493/head
zhangzhiming [Wed, 26 Jan 2022 09:00:07 +0000 (17:00 +0800)]
rgw: Match decode_json with dump for default-placement in RGWZoneGroup.

Fixes: https://tracker.ceph.com/issues/54016
Signed-off-by: zhiming zhang <zhangzhm1@chinatelecom.cn>
(cherry picked from commit 45c448c49ed92f629dc07f755f2024715094fd69)

3 years agorgw: fix bad memory usage of bucket chown method 45490/head
Mohammad Fatemipour [Sun, 19 Dec 2021 18:33:55 +0000 (22:03 +0330)]
rgw: fix bad memory usage of bucket chown method

In RGWBucketCtl::chown we have one RGWObjectCtx for all objects of a bucket.
RGWObjectCtx contains a cache (std::map) of object states that grows
continuously. For buckets with millions of objects, this leads to huge memory usage.

The chown process does not need this cache, so we can create one RGWObjectCtx
for every 1000 objects to limit memory usage.
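
The batching idea can be sketched as follows. `ObjectCtx` and `chown_all` are hypothetical stand-ins written for this illustration; only the batch size of 1000 comes from the description above.

```python
from itertools import islice


class ObjectCtx:
    """Hypothetical stand-in for RGWObjectCtx: caches one state per object."""

    def __init__(self):
        self.state_cache = {}

    def get_state(self, key):
        return self.state_cache.setdefault(key, {"owner": None})


def chown_all(keys, new_owner, batch_size=1000):
    """Change ownership using a fresh context for every batch of objects."""
    it = iter(keys)
    changed = 0
    peak_cache = 0
    while True:
        batch = list(islice(it, batch_size))
        if not batch:
            break
        ctx = ObjectCtx()  # the cache starts empty for each batch
        for key in batch:
            ctx.get_state(key)["owner"] = new_owner
            changed += 1
        peak_cache = max(peak_cache, len(ctx.state_cache))
    return changed, peak_cache


changed, peak_cache = chown_all((f"obj-{i}" for i in range(2500)), "alice")
```

The cache never holds more than one batch of entries, so memory use is bounded by the batch size rather than by the number of objects in the bucket.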

Fixes: https://tracker.ceph.com/issues/53599
Signed-off-by: Mohammad Fatemipour <mohammad.fatemipour@sotoon.ir>
(cherry picked from commit cf2d83ef81458524715c23e802977dc0760c847f)

3 years agoosd/scrub: add scrub duration to pg stats 45471/head
Aishwarya Mathuria [Thu, 10 Mar 2022 10:25:37 +0000 (15:55 +0530)]
osd/scrub: add scrub duration to pg stats

Addition of a SCRUB_DURATION field that shows how long the scrub/deep-scrub of a pg took.
This field will be displayed in the output of the "ceph pg dump --format=json" and "ceph pg ls-by-pool --format=json" commands.

Signed-off-by: Aishwarya Mathuria <amathuri@redhat.com>
(cherry picked from commit be9f8a15cd490cad9b01556273abe56c2ed7162d)

3 years agoMerge pull request #45342 from benhanokh/wip-54523-quincy
Yuri Weinstein [Wed, 16 Mar 2022 20:44:02 +0000 (13:44 -0700)]
Merge pull request #45342 from benhanokh/wip-54523-quincy

quincy: OSD::Modify OSD Fast-Shutdown to work safely i.e. quiesce all activit…

Reviewed-by: Josh Durgin <jdurgin@redhat.com>
Reviewed-by: Adam Kupczyk <akupczyk@redhat.com>
3 years agoMerge pull request #45322 from ljflores/wip-54467-quincy
Yuri Weinstein [Wed, 16 Mar 2022 20:42:40 +0000 (13:42 -0700)]
Merge pull request #45322 from ljflores/wip-54467-quincy

quincy: osd: require osd_pg_max_concurrent_snap_trims > 0

Reviewed-by: Neha Ojha <nojha@redhat.com>
3 years agoMerge pull request #45237 from k0ste/wip-54449-quincy
Yuri Weinstein [Wed, 16 Mar 2022 20:41:06 +0000 (13:41 -0700)]
Merge pull request #45237 from k0ste/wip-54449-quincy

quincy: mgr/prometheus: Added `avail_raw` field for Pools DF Prometheus mgr module

Reviewed-by: Neha Ojha <nojha@redhat.com>
3 years agoMerge pull request #45193 from ronen-fr/wip-rf-45068-quincy
Yuri Weinstein [Wed, 16 Mar 2022 20:40:26 +0000 (13:40 -0700)]
Merge pull request #45193 from ronen-fr/wip-rf-45068-quincy

quincy: osd/scrub: stop sending bogus digest-update event messages

Reviewed-by: Neha Ojha <nojha@redhat.com>
3 years agorgw: adding OPT_DATA_SYNC_RUN to gc_ops_list so that gc gets 45422/head
Pritha Srivastava [Tue, 1 Mar 2022 10:23:32 +0000 (15:53 +0530)]
rgw: adding OPT_DATA_SYNC_RUN to gc_ops_list so that gc gets
initialized for this command.

Fixes: https://tracker.ceph.com/issues/54433
Signed-off-by: Pritha Srivastava <prsrivas@redhat.com>
(cherry picked from commit 364f997e63030c28229757cf6221f8d3bf8b1686)

3 years agorgw: add OPT_BUCKET_SYNC_RUN to gc_ops_list, so that
Pritha Srivastava [Fri, 25 Feb 2022 11:00:46 +0000 (16:30 +0530)]
rgw: add OPT_BUCKET_SYNC_RUN to gc_ops_list, so that
gc is initialised and send_chain does not crash.

Also deleting objects inline in case gc is uninitialised.

Fixes: https://tracker.ceph.com/issues/54417
Signed-off-by: Pritha Srivastava <prsrivas@redhat.com>
(cherry picked from commit aa3006ea34e301148779f6055ee3fa045dabbf7e)

3 years agoqa/workunits/mon/pg_autoscaler.sh: clean up white space 45396/head
Kamoltat [Fri, 4 Mar 2022 16:40:07 +0000 (16:40 +0000)]
qa/workunits/mon/pg_autoscaler.sh: clean up white space

remove white space and weird indentations

Signed-off-by: Kamoltat <ksirivad@redhat.com>
(cherry picked from commit 4add5feff657aad2afbec56a21b544bedf7f9b95)

3 years agoqa/workunits/cephtool/test.sh: added test cases for target_size_ratio
Kamoltat [Fri, 4 Mar 2022 16:22:45 +0000 (16:22 +0000)]
qa/workunits/cephtool/test.sh: added test cases for target_size_ratio

Test the commands:

`osd pool create` <pool> --target_size_ratio <float>

`osd pool set` <pool> target_size_ratio <float>

`osd pool get` <pool> target_size_ratio

Signed-off-by: Kamoltat <ksirivad@redhat.com>
(cherry picked from commit 09785475f1af5050a7cae679566ac17629dfc584)

3 years agomon/OSDMonitor.cc: cannot set target_size_ratio to negative
Kamoltat [Fri, 4 Mar 2022 16:18:45 +0000 (16:18 +0000)]
mon/OSDMonitor.cc: cannot set target_size_ratio to negative

Throw an error when the user sets `target_size_ratio`
to a negative value using the command:

`osd pool set <pool> target_size_ratio <float>`
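
A rough sketch of the check, assuming the only constraint is that the value is non-negative. The real check lives in C++ in src/mon/OSDMonitor.cc; the function name and error text here are hypothetical.

```python
def validate_target_size_ratio(value: float) -> float:
    """Hypothetical validator: accept any ratio >= 0.0, reject negatives."""
    if value < 0.0:
        raise ValueError(f"target_size_ratio cannot be negative: {value}")
    return value


accepted = validate_target_size_ratio(2.5)  # values above 1.0 are allowed

try:
    validate_target_size_ratio(-0.1)
    rejected = False
except ValueError:
    rejected = True
```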

Signed-off-by: Kamoltat <ksirivad@redhat.com>
(cherry picked from commit 1b882054ba386d026a27fcd4f3b9f38e75a531cb)

3 years agomon/MonCommands.h: fix target_size_ratio range
Kamoltat [Thu, 17 Feb 2022 17:39:47 +0000 (17:39 +0000)]
mon/MonCommands.h: fix target_size_ratio range
The valid range of `target_size_ratio` should be 0.0 and up with no upper limit,
rather than 0.0 to 1.0.

Fixes: https://tracker.ceph.com/issues/54316
Signed-off-by: Kamoltat <ksirivad@redhat.com>
(cherry picked from commit e5a5b81cf05e25b1e4f35ad498c7a5a1c29a7e45)

Conflicts:
src/mon/MonCommands.h
 - don't add "name=yes_i_really_mean_it,type=CephBool,req=false"

3 years agoMerge pull request #45321 from kamoltat/wip-ksirivad-backport-quincy-fix-autoscale-doc
Kamoltat Sirivadhna [Mon, 14 Mar 2022 20:30:43 +0000 (16:30 -0400)]
Merge pull request #45321 from kamoltat/wip-ksirivad-backport-quincy-fix-autoscale-doc

quincy: doc/rados/operations/placement-groups: fix --bulk docs
Reviewed-by: Anthony D'Atri <anthony.datri@gmail.com>
Reviewed-by: Neha Ojha <nojha@redhat.com>
3 years agoinclude: Define dlfcn.h on Windows 45383/head
Lucian Petrut [Mon, 7 Mar 2022 08:12:23 +0000 (08:12 +0000)]
include: Define dlfcn.h on Windows

"dlfcn.h" is not available on Windows, so Ceph provides a drop-in
replacement through "dlfcn_compat.h".

The issue is that directly importing "dlfcn.h" fails at the moment,
for which reason we'll simply add a file called "dlfcn.h" that
includes "dlfcn_compat.h".

Signed-off-by: Lucian Petrut <lpetrut@cloudbasesolutions.com>
(cherry picked from commit 8b7432b9e914c47bbce74bf999e4c7aef57561e2)

3 years agoRevert "doc/dev: Running workunits locally" 45367/head
Matan Breizman [Thu, 3 Mar 2022 10:23:48 +0000 (10:23 +0000)]
Revert "doc/dev: Running workunits locally"

This reverts commit 7324abbe0122e02d11c09be4ea0f3899abc16bbd.

Signed-off-by: Matan Breizman <mbreizma@redhat.com>
(cherry picked from commit 67570c9bb18023bf1b5af27fff46737ff3c93caf)

3 years agomon, pybind/mgr: Add additional debug level logs for pool options 45363/head
Kamoltat [Wed, 2 Mar 2022 16:52:57 +0000 (16:52 +0000)]
mon, pybind/mgr: Add additional debug level logs for pool options

We found that these logs helped when
debugging issues like:
https://tracker.ceph.com/issues/54263.

Added debug level logs to `do_set_pool()` in
src/mon/OSDMonitor.cc.

Added debug level logs to `_maybe_adjust()` in
src/pybind/mgr/pg_autoscaler/module.py.

Signed-off-by: Kamoltat <ksirivad@redhat.com>
(cherry picked from commit ee40c2d2431825f36a01108115b3913112e2ef54)

3 years agoupgrade/pacific-x/parallel: Added mds.a and mds.b
Kamoltat [Mon, 28 Feb 2022 21:40:43 +0000 (21:40 +0000)]
upgrade/pacific-x/parallel: Added mds.a and mds.b

Added mds daemons so that the suite can create
cephFS pools and set options using
`do_set_pool()` in FSCommand.cc, so that
we can cover corner cases like the one in

https://tracker.ceph.com/issues/54263

Signed-off-by: Kamoltat <ksirivad@redhat.com>
(cherry picked from commit 1bc51f057fa80b3e34d8bac06ea22ea168fb8cf8)

3 years agoosd/osd_types: reorder pg_num_max
Kamoltat [Mon, 28 Feb 2022 21:38:34 +0000 (21:38 +0000)]
osd/osd_types: reorder pg_num_max

moved `pg_num_max` to be at the end of the
list in src/osd/osd_types.cc and
src/osd/osd_types.h.

Added comments to `opt_mapping` and `pool_opts_t`
about the importance of the order of options
in the list and class.
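
Why the order matters can be sketched with integer enums, assuming each option is identified by its numeric position when stored. The option names `OPT_A` and `OPT_B` and the stored value are hypothetical.

```python
from enum import IntEnum


class OptsOld(IntEnum):
    """Hypothetical option list as known to an older release."""
    OPT_A = 0
    OPT_B = 1


class OptsInserted(IntEnum):
    """New option inserted in the middle: OPT_B silently changes its key."""
    OPT_A = 0
    PG_NUM_MAX = 1
    OPT_B = 2


class OptsAppended(IntEnum):
    """New option appended at the end: existing keys are unchanged."""
    OPT_A = 0
    OPT_B = 1
    PG_NUM_MAX = 2


# Data written by the older release, keyed by numeric position.
stored = {int(OptsOld.OPT_B): 86400}

misread = {OptsInserted(k).name: v for k, v in stored.items()}
correct = {OptsAppended(k).name: v for k, v in stored.items()}
```

With the option inserted in the middle, the value stored for `OPT_B` is read back as `PG_NUM_MAX`; with the option appended, it is read back under its original name.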

Fixes: https://tracker.ceph.com/issues/54263
Signed-off-by: Kamoltat <ksirivad@redhat.com>
(cherry picked from commit e44c469f59eaef18ecd3c3b348981939043eae02)

3 years agoceph-fuse: perform cleanup if test_dentry_handling failed 45331/head
Nikhilkumar Shelke [Wed, 2 Feb 2022 11:51:46 +0000 (17:21 +0530)]
ceph-fuse: perform cleanup if test_dentry_handling failed

If the remount fails for some reason, ceph_abort() is
called, which terminates the child process
without cleanup.
To fix this, the ceph_abort() call is moved after
the cleanup.

Fixes: https://tracker.ceph.com/issues/54049
Signed-off-by: Nikhilkumar Shelke <nshelke@redhat.com>
(cherry picked from commit 8c778e79840f1aa9b9731e2ef20881da0d122fda)

3 years agoos/bluestore: Fix problem with allocation desync 45342/head
Gabriel BenHanokh [Mon, 7 Mar 2022 15:36:34 +0000 (17:36 +0200)]
os/bluestore: Fix problem with allocation desync

Close the window in which the allocator state and bluefs state could be
captured while out of sync.

Signed-off-by: Gabriel Benhanokh <gbenhano@redhat.com>
(cherry picked from commit 8d052558bed4a9761c3b181253568a8686ee2df2)

3 years agoos/bluestore/bluefs: Fix sync compaction
Adam Kupczyk [Thu, 3 Mar 2022 14:39:00 +0000 (15:39 +0100)]
os/bluestore/bluefs: Fix sync compaction

Fixes a problem with sync compaction (_rewrite_log_and_layout_sync):
log_seq was not updated after compacting the log.

This caused _replay to stop reading the log right after the first transaction.

... 20 bluefs _replay 0x0:  op_dir_create sharding
... 20 bluefs _replay 0x0:  op_dir_link  sharding/def to 21
... 20 bluefs _replay 0x0:  op_jump_seq 1025
... 10 bluefs _read h 0x555557c46400 0x1000~1000 from file(ino 1 size 0x1000 mtime 0.000000 allocated 410000 alloc_commit 410000 extents [1:0x1540000~410000])
... 20 bluefs _read left 0xff000 len 0x1000
... 20 bluefs _read got 4096
... 10 bluefs _replay 0x1000: stop: seq 1025 != expected 1026

This is a side effect of the bluefs fine-grained locks refactor.
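
A simplified model of the failure, inferred only from the log lines above. It assumes that replay expects each transaction's seq to be the previous seq plus one and that op_jump_seq resets the current seq; the first transaction's seq of 1 and the function name are hypothetical, and this is not the bluefs code.

```python
def replay(transactions):
    """Apply transactions in order, stopping at the first unexpected seq."""
    log_seq = 0
    applied = []
    for seq, ops in transactions:
        expected = log_seq + 1
        if seq != expected:
            break  # corresponds to "stop: seq N != expected M"
        log_seq = seq
        for op, arg in ops:
            if op == "jump_seq":
                log_seq = arg  # later transactions continue from here
        applied.append(seq)
    return applied


compacted = (1, [("dir_create", "sharding"), ("jump_seq", 1025)])

# Writer did not advance its log_seq after compaction: next seq is 1025.
stale_writer_log = [compacted, (1025, [("dir_link", "sharding/def")])]

# Writer advanced its log_seq after compaction: next seq is 1026.
updated_writer_log = [compacted, (1026, [("dir_link", "sharding/def")])]

applied_stale = replay(stale_writer_log)
applied_updated = replay(updated_writer_log)
```

In the stale case, replay stops after the compacted transaction, so everything written afterwards is ignored.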

Signed-off-by: Adam Kupczyk <akupczyk@redhat.com>
(cherry picked from commit 2f8e37064ca079c960929d7bb91e84fbf7f5cd47)

Conflicts:
src/test/objectstore/test_bluefs.cc
(cherry picked from commit 4fd98ce0359d6c3a36f08a3d87a78c3f0b65018d)

3 years agoosd: Modify OSD Fast-Shutdown to work safely
Gabriel BenHanokh [Mon, 7 Mar 2022 15:16:54 +0000 (17:16 +0200)]
osd: Modify OSD Fast-Shutdown to work safely

quiesce all activities and destage allocations to disk before killing the OSD

    1) keep the old (unsafe) fast-shutdown when we are not using NCB (non null-manager())
    2) skip service.prepare_to_stop() which can take as much as 10 seconds
    3) skip debug options in fast-shutdown
    4) set_state(STATE_STOPPING) which will stop accepting new tasks to this OSD
    5) clear op_shardedwq queues, this is safe since we haven't started processing them
    6) stop timer
    7) drain osd_op_tp (no new items will be added)
    8) now we can safely call umount which will close_db/bluefs and will destage allocation to disk
    9) skip _shutdown_cache() when we are in the middle of a fast-shutdown
    10) increase debug level on fast-shutdown
    11) add option for bluestore_qfsck_on_mount to force scan on mount for all tests
    12) disable fsck-on-umount when running fast-shutdown
    13) add an option to increase debug level at fast-shutdown umount()
    14) set a time limit to fast-shutdown

    15) Bug fix: BlueStore::pool_statfs must not access the db after it was removed
    16) Fix error message for qfsck (error was caused by PR https://github.com/ceph/ceph/pull/44563)

    17) make shutdown-timeout configurable

Fixes: https://tracker.ceph.com/issues/53266
Signed-off-by: Gabriel Benhanokh <gbenhano@redhat.com>
(cherry picked from commit 9b2a64a5f6ea743b2a4f4c2dbd703248d88b2a96)

3 years agoosd: require osd_pg_max_concurrent_snap_trims > 0 45322/head
Dan van der Ster [Thu, 24 Feb 2022 08:42:00 +0000 (09:42 +0100)]
osd: require osd_pg_max_concurrent_snap_trims > 0

If osd_pg_max_concurrent_snap_trims is zero, we mistakenly clear
the snaptrim queue. Require it to be > 0.

Fixes: https://tracker.ceph.com/issues/54396
Signed-off-by: Dan van der Ster <daniel.vanderster@cern.ch>
(cherry picked from commit 29545b617b3b0324f9b0b20e032e3e38557115eb)