Brad Hubbard [Tue, 22 Sep 2020 01:29:06 +0000 (11:29 +1000)]
test: Add support for fedora 32, 33 and Ubuntu 20.04
Enable creation and use of these OS images in docker-test.
Signed-off-by: Brad Hubbard <bhubbard@redhat.com>
Brad Hubbard [Tue, 22 Sep 2020 02:51:28 +0000 (12:51 +1000)]
tests: Make sure install-deps is run noninteractively
This gets past things like tzconfig stopping for user input.
Remove redundant install of python-virtualenv.
Signed-off-by: Brad Hubbard <bhubbard@redhat.com>
Brad Hubbard [Tue, 22 Sep 2020 00:30:01 +0000 (10:30 +1000)]
test/centos-8: Install git before running install-deps
Signed-off-by: Brad Hubbard <bhubbard@redhat.com>
Brad Hubbard [Tue, 22 Sep 2020 00:28:09 +0000 (10:28 +1000)]
test/docker-test: Fix permissions issue when using podman
Signed-off-by: Brad Hubbard <bhubbard@redhat.com>
Patrick Donnelly [Mon, 21 Sep 2020 22:26:22 +0000 (15:26 -0700)]
Merge PR #36776 into master
* refs/pull/36776/head:
systemd: Support Graceful Reboot for AIO Node
Reviewed-by: Patrick Donnelly <pdonnell@redhat.com>
Reviewed-by: David Disseldorp <ddiss@suse.de>
Patrick Donnelly [Mon, 21 Sep 2020 21:04:06 +0000 (14:04 -0700)]
Merge PR #37227 into master
* refs/pull/37227/head:
qa/cephfs: don't pass args to destroy() in recreate()
Reviewed-by: Patrick Donnelly <pdonnell@redhat.com>
Patrick Donnelly [Mon, 21 Sep 2020 21:03:32 +0000 (14:03 -0700)]
Merge PR #37233 into master
* refs/pull/37233/head:
qa/mgr: revert a patch from commit 04ed58f
Reviewed-by: Patrick Donnelly <pdonnell@redhat.com>
Reviewed-by: Ernesto Puerta <epuertat@redhat.com>
Samuel Just [Mon, 21 Sep 2020 17:57:43 +0000 (10:57 -0700)]
Merge pull request #37271 from cyx1231st/wip-seastore-fix-non-repeatable-read
crimson/seastore: fix potential non-repeatable-read from RootBlock
Reviewed-by: Samuel Just <sjust@redhat.com>
Reviewed-by: Ronen Friedman <rfriedma@redhat.com>
Ilya Dryomov [Mon, 21 Sep 2020 17:39:30 +0000 (19:39 +0200)]
Merge pull request #36927 from idryomov/wip-krbd-noudev
krbd: optionally skip waiting for udev events
Reviewed-by: Jason Dillaman <dillaman@redhat.com>
Reviewed-by: Sébastien Han <seb@redhat.com>
Patrick Donnelly [Mon, 21 Sep 2020 16:48:28 +0000 (09:48 -0700)]
Merge PR #37213 into master
* refs/pull/37213/head:
mgr/rook: Pass pod namespace to list_namespaced_pod()
Reviewed-by: Travis Nielsen <tnielsen@redhat.com>
Lenz Grimmer [Mon, 21 Sep 2020 15:58:03 +0000 (17:58 +0200)]
Merge pull request #37101 from LenzGr/master-documentation
doc: Updated `HACKING.rst` and `README.rst`
Reviewed-by: Ernesto Puerta <epuertat@redhat.com>
Reviewed-by: Kefu Chai <kchai@redhat.com>
Reviewed-by: Tiago Melo <tmelo@suse.com>
Lenz Grimmer [Mon, 21 Sep 2020 12:33:19 +0000 (14:33 +0200)]
Merge pull request #37137 from tspmelo/wip-fix-iscsi-tests
mgr/dashboard: Fix iSCSI backend unit-test
Reviewed-by: Kiefer Chang <kiefer.chang@suse.com>
Reviewed-by: Stephan Müller <smueller@suse.com>
Jason Dillaman [Mon, 21 Sep 2020 12:19:08 +0000 (08:19 -0400)]
Merge pull request #37262 from trociny/wip-rbd-nbd-quiesce-hook
rbd-nbd: fix typo in mini help
Reviewed-by: Jason Dillaman <dillaman@rehdat.com>
Lenz Grimmer [Mon, 21 Sep 2020 11:59:48 +0000 (13:59 +0200)]
Merge pull request #35785 from rhcs-dashboard/wip-45957-consolidate_Osd_Endpoints
mgr/dashboard: Consolidate Osd mark endpoints
Reviewed-by: Avan Thakkar <athakkar@redhat.com>
Reviewed-by: Ernesto Puerta <epuertat@redhat.com>
Reviewed-by: Laura Paduano <lpaduano@suse.com>
Reviewed-by: Tatjana Dehler <tdehler@suse.com>
Ilya Dryomov [Wed, 16 Sep 2020 14:38:10 +0000 (16:38 +0200)]
qa: add test for mapping and unmapping from a network namespace
Signed-off-by: Ilya Dryomov <idryomov@gmail.com>
Guillaume Abrioux [Mon, 21 Sep 2020 11:41:41 +0000 (13:41 +0200)]
Merge pull request #37234 from guits/guits-quick-fix
ceph-volume: fix wrong type passed in terminal.warning()
Mykola Golub [Mon, 21 Sep 2020 11:05:21 +0000 (14:05 +0300)]
Merge pull request #37222 from dillaman/wip-librbd-image-dispatch
librbd: bug fixes and cleanup for IO dispatch path
Reviewed-by: Mykola Golub <mgolub@suse.com>
Kefu Chai [Mon, 21 Sep 2020 09:49:08 +0000 (17:49 +0800)]
Merge pull request #37215 from uweigand/fix-librados-test-endian
test/librados: fix endian bugs in checksum test cases
Reviewed-by: Patrick Donnelly <pdonnell@redhat.com>
Lenz Grimmer [Mon, 21 Sep 2020 09:48:28 +0000 (11:48 +0200)]
Merge pull request #36900 from wjwithagen/wjw-enhance-mgr_module.py
mgr/dashboard: Report the missing path in error message
Reviewed-by: Ernesto Puerta <epuertat@redhat.com>
Reviewed-by: Laura Paduano <lpaduano@suse.com>
Reviewed-by: Tiago Melo <tmelo@suse.com>
Reviewed-by: Volker Theile <vtheile@suse.com>
Lenz Grimmer [Mon, 21 Sep 2020 09:45:19 +0000 (11:45 +0200)]
Merge pull request #37087 from tspmelo/wip-iscsi-logged-in
Reviewed-by: Laura Paduano <lpaduano@suse.com>
Reviewed-by: Ricardo Marques <rimarques@suse.com>
Kefu Chai [Mon, 21 Sep 2020 09:43:17 +0000 (17:43 +0800)]
Merge pull request #37261 from tchaikov/wip-47552
common/BackTrace: do not use len for length of demangled symbol
Reviewed-by: Radoslaw Zarzynski <rzarzyns@redhat.com>
Kefu Chai [Mon, 21 Sep 2020 09:38:56 +0000 (17:38 +0800)]
Merge pull request #37185 from david-z/wip-fix-osdmaptool
tools/osdmaptool.cc: fix inaccurate pg map result when simulating osd out
Reviewed-by: Josh Durgin <jdurgin@redhat.com>
Reviewed-by: Neha Ojha <nojha@redhat.com>
Kefu Chai [Mon, 21 Sep 2020 09:37:21 +0000 (17:37 +0800)]
Merge pull request #37210 from changchengx/no_tune_message
messenger: remove unused variable
Reviewed-by: Kefu Chai <kchai@redhat.com>
Lenz Grimmer [Mon, 21 Sep 2020 08:42:19 +0000 (10:42 +0200)]
Merge pull request #37183 from rhcs-dashboard/fix-47434-master
mgr/dashboard: table detail rows overflow
Reviewed-by: Avan Thakkar <athakkar@redhat.com>
Reviewed-by: Ernesto Puerta <epuertat@redhat.com>
Reviewed-by: Nizamudeen A <nia@redhat.com>
Guillaume Abrioux [Fri, 18 Sep 2020 11:51:51 +0000 (13:51 +0200)]
ceph-volume: fix wrong type passed in terminal.warning()
`terminal.warning()` expects a `str`.
Passing `e` means we pass an object of type `exceptions.RuntimeError`.
Changing to `terminal.warning(e.message)` fixes the issue.
Closes: https://bugzilla.redhat.com/show_bug.cgi?id=1877672
Resolves: rhbz#1877672
Signed-off-by: Guillaume Abrioux <gabrioux@redhat.com>
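The type mismatch described in this commit can be illustrated with a minimal Python sketch. The `warning()` function below is a hypothetical stand-in that only assumes the callee requires a `str`; it is not ceph-volume's actual `terminal.warning()`. Note that `e.message` exists only on Python 2 exceptions, so the sketch uses `str(e)` as a portable equivalent.

```python
def warning(msg):
    """Hypothetical stand-in for a logger that only accepts str."""
    if not isinstance(msg, str):
        raise TypeError("warning() expects a str, got %s" % type(msg).__name__)
    return "--> " + msg


def report(exc):
    # Convert the exception to text before handing it to the logger.
    return warning(str(exc))


try:
    raise RuntimeError("device is busy")
except RuntimeError as e:
    line = report(e)
```

Passing the exception object itself to `warning()` raises `TypeError` in this sketch, which mirrors the class of failure the commit addresses.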
Kefu Chai [Mon, 21 Sep 2020 06:41:35 +0000 (14:41 +0800)]
Merge pull request #37268 from anthonyeleven/anthonyeleven/doc-fixes
doc/man: Add optional reweight-by-utilization args
Reviewed-by: Zac Dover <zac.dover@gmail.com>
Reviewed-by: Kefu Chai <kchai@redhat.com>
Anthony D'Atri [Mon, 21 Sep 2020 00:37:58 +0000 (17:37 -0700)]
doc/man: Add optional reweight-by-utilization args
doc/mgr: Grammar and wording for Prometheus labels
doc/rados: Spelling and clarity
Signed-off-by: Anthony D'Atri <anthony.datri@gmail.com>
Yingxin Cheng [Fri, 18 Sep 2020 08:55:03 +0000 (16:55 +0800)]
crimson/seastore: fix potential non-repeatable-read from RootBlock
Load the root block into the transaction when reading it.
Signed-off-by: Yingxin Cheng <yingxin.cheng@intel.com>
Brad Hubbard [Mon, 21 Sep 2020 01:34:58 +0000 (11:34 +1000)]
Merge pull request #37176 from badone/wip-enable-mgr-client-debug
qa: Enable debug_client for mgr tests
Reviewed-by: Patrick Donnelly <pdonnell@redhat.com>
Changcheng Liu [Thu, 17 Sep 2020 05:15:44 +0000 (13:15 +0800)]
messenger: remove unused variable
Signed-off-by: Changcheng Liu <changcheng.liu@aliyun.com>
Mykola Golub [Sun, 20 Sep 2020 07:19:30 +0000 (08:19 +0100)]
rbd-nbd: fix typo in mini help
Signed-off-by: Mykola Golub <mgolub@suse.com>
Kefu Chai [Sun, 20 Sep 2020 03:30:26 +0000 (11:30 +0800)]
common/BackTrace: do not use len for length of demangled symbol
it turns out `len` is longer than the length of the demangled symbol,
so let's rely on the `\0` terminator in the returned char* string instead.
in this change, use `status` to tell whether the demangle succeeded or
not.
Fixes: https://tracker.ceph.com/issues/47552
Signed-off-by: Kefu Chai <kchai@redhat.com>
Ilya Dryomov [Sat, 19 Sep 2020 09:36:43 +0000 (11:36 +0200)]
Merge pull request #37072 from idryomov/wip-kcephfs-blacklisted-string
mds: add " (blacklisted)" to session reject error string
Reviewed-by: Patrick Donnelly <pdonnell@redhat.com>
Kefu Chai [Sat, 19 Sep 2020 04:08:35 +0000 (12:08 +0800)]
Merge pull request #37207 from tchaikov/wip-doc-dev-osx
doc/dev/macos.rst: disable features not supported on osx
Reviewed-by: Kamoltat Sirivadhna <ksirivad@redhat.com>
Reviewed-by: Josh Durgin <jdurgin@redhat.com>
Reviewed-by: Neha Ojha <nojha@redhat.com>
Kefu Chai [Sat, 19 Sep 2020 03:07:52 +0000 (11:07 +0800)]
Merge pull request #37252 from pponnuvel/spellcheck-docs
doc: Fixed a number of typos in documentation
Reviewed-by: Kefu Chai <kchai@redhat.com>
Kefu Chai [Sat, 19 Sep 2020 02:57:58 +0000 (10:57 +0800)]
Merge pull request #37216 from tchaikov/wip-doc-cephadmin-codeblock
doc/cephadm: use appropriate directive for formatting codeblocks
Reviewed-by: Zac Dover <zac.dover@gmail.com>
Patrick Donnelly [Sat, 19 Sep 2020 01:54:57 +0000 (18:54 -0700)]
Merge PR #37202 into master
* refs/pull/37202/head:
mon: allow overriding the initial mon_host
Reviewed-by: Neha Ojha <nojha@redhat.com>
Kefu Chai [Sat, 19 Sep 2020 01:33:31 +0000 (09:33 +0800)]
Merge pull request #37224 from tchaikov/wip-cmake-boost-MPL-list-size
cmake: introduce Boost::MPL interface library for increasing BOOST_MPL_LIMIT_LIST_SIZE
Reviewed-by: David Zafman <dzafman@redhat.com>
Reviewed-by: Neha Ojha <nojha@redhat.com>
Patrick Donnelly [Sat, 19 Sep 2020 00:29:27 +0000 (17:29 -0700)]
Merge PR #37214 into master
* refs/pull/37214/head:
mgr/volumes/nfs: Check if orchestrator spec service_id is valid
Reviewed-by: Patrick Donnelly <pdonnell@redhat.com>
Patrick Donnelly [Sat, 19 Sep 2020 00:27:09 +0000 (17:27 -0700)]
Merge PR #37190 into master
* refs/pull/37190/head:
mon/MonCap: check profile_grants too while checking caps
Reviewed-by: Patrick Donnelly <pdonnell@redhat.com>
Patrick Donnelly [Sat, 19 Sep 2020 00:24:38 +0000 (17:24 -0700)]
Merge PR #37148 into master
* refs/pull/37148/head:
mds/FSMap: do not set legacy_client_fscid after filtering
Reviewed-by: Rishabh Dave <ridave@redhat.com>
Patrick Donnelly [Sat, 19 Sep 2020 00:23:32 +0000 (17:23 -0700)]
Merge PR #37037 into master
* refs/pull/37037/head:
mds: fix purge_queue's _calculate_ops is inaccurate
Reviewed-by: Patrick Donnelly <pdonnell@redhat.com>
Patrick Donnelly [Fri, 18 Sep 2020 23:25:59 +0000 (16:25 -0700)]
Merge PR #37218 into master
* refs/pull/37218/head:
qa: spawn MDS daemons before creating fs
Reviewed-by: Kefu Chai <kchai@redhat.com>
Neha Ojha [Fri, 18 Sep 2020 21:31:45 +0000 (14:31 -0700)]
Merge pull request #35906 from gregsfortytwo/wip-stretch-mode
Add a new stretch mode for 2-site Ceph clusters
Reviewed-by: Josh Durgin <jdurgin@redhat.com>
Patrick Donnelly [Fri, 18 Sep 2020 18:04:11 +0000 (11:04 -0700)]
Merge PR #36957 into master
* refs/pull/36957/head:
mds: convert stringstream to CachedStackStringStream
Reviewed-by: Kotresh Hiremath Ravishankar <khiremat@redhat.com>
Michael Fritch [Fri, 18 Sep 2020 17:50:35 +0000 (11:50 -0600)]
Merge pull request #37245 from mgfritch/cephadm-extra-ceph-conf-test
mgr/cephadm: fixup expected extra ceph conf test result
Reviewed-by: Adam King <adking@redhat.com>
Ponnuvel Palaniyappan [Fri, 18 Sep 2020 17:12:07 +0000 (18:12 +0100)]
doc: Fixed a number of typos in documentation
Signed-off-by: Ponnuvel Palaniyappan <pponnuvel@gmail.com>
Michael Fritch [Fri, 18 Sep 2020 14:54:00 +0000 (08:54 -0600)]
mgr/cephadm: fixup expected extra ceph conf test result
fix test failure introduced by:
ff7e76348e5457fa6acb23545fcef56d6640c50a
```
E AssertionError: expected call not found.
E Expected: _run_cephadm('test', 'mon.test', 'deploy', ['--name', 'mon.test', '--reconfig', '--config-json', '-'], stdin='{"config": "\\n\\n[mon]\\nk=v\\n", "keyring": ""}')
E Actual: _run_cephadm('test', 'mon.test', 'deploy',
['--name', 'mon.test', '--reconfig', '--config-json', '-'],
stdin='{"config": "\\n\\n[mon]\\nk=v\\n", "keyring": ""}', image='')
```
Signed-off-by: Michael Fritch <mfritch@suse.com>
Lenz Grimmer [Fri, 18 Sep 2020 13:15:34 +0000 (15:15 +0200)]
doc: Updated `HACKING.rst` and `README.rst`
Replaced the content of `HACKING.rst` in the dashboard source code
directory with a pointer to the new location in the developer guide.
Updated references in `README.rst` to also point to the online versions
of these files.
Fixes: tracker.ceph.com/issues/47396
Signed-off-by: Lenz Grimmer <lgrimmer@suse.com>
Rishabh Dave [Fri, 18 Sep 2020 11:03:41 +0000 (16:33 +0530)]
qa/mgr: revert a patch from commit 04ed58f
mds_cluster.mds_fail() runs command "mds fail" not "fs fail". The reason
for failure was PR #32581 which accidentally changed the return code
from 0 to EINVAL. Since this was reversed in PR #37159, the change
introduced by 04ed58f is not only incorrect but also redundant.
Signed-off-by: Rishabh Dave <ridave@redhat.com>
Lenz Grimmer [Fri, 18 Sep 2020 10:59:30 +0000 (12:59 +0200)]
Merge pull request #34545 from rhcs-dashboard/read_only
mgr/dashboard: Disabling the form inputs for the read_only modals
Reviewed-by: Alfonso Martínez <almartin@redhat.com>
Reviewed-by: Ernesto Puerta <epuertat@redhat.com>
Reviewed-by: Tiago Melo <tmelo@suse.com>
Lenz Grimmer [Fri, 18 Sep 2020 09:58:47 +0000 (11:58 +0200)]
Merge pull request #37023 from p-se/grafana-many-to-many
mgr/dashboard: Fix many-to-many issue in host-details Grafana dashboard
Reviewed-by: Avan Thakkar <athakkar@redhat.com>
Joshua Schmid [Fri, 18 Sep 2020 08:57:52 +0000 (10:57 +0200)]
Merge pull request #37059 from guits/guits-cephadm-shell-multiple-mounts
cephadm: support multiple mounts when running interactive shell
Joshua Schmid [Fri, 18 Sep 2020 08:56:24 +0000 (10:56 +0200)]
Merge pull request #36890 from sebastian-philipp/cephadm-extend-ceph.conf
mgr/cephadm: Add extra-ceph-conf
Joshua Schmid [Fri, 18 Sep 2020 08:55:17 +0000 (10:55 +0200)]
Merge pull request #37135 from sebastian-philipp/cephadm-race-add-host-vs-apply
mgr/cephadm: Fix race between host_add and _apply_all_specs
Joshua Schmid [Fri, 18 Sep 2020 08:50:30 +0000 (10:50 +0200)]
Merge pull request #36969 from votdev/issue_46666_container_spec
cephadm: Introduce 'container' specification to deploy custom containers
Rishabh Dave [Fri, 18 Sep 2020 08:18:33 +0000 (13:48 +0530)]
qa/cephfs: don't pass args to destroy() in recreate()
In filesystem.py, don't set value of reset_obj_attrs to False.
Fixes: https://tracker.ceph.com/issues/47526
Signed-off-by: Rishabh Dave <ridave@redhat.com>
Rishabh Dave [Wed, 16 Sep 2020 10:59:24 +0000 (16:29 +0530)]
mon/MonCap: check profile_grants too while checking caps
When checking if a certain fs subcommand can and should be executed in
FSCommands.cc, check permissions in "profile_grants" too when the caps
for that entity contains a cap profile.
Fixes: https://tracker.ceph.com/issues/47423
Signed-off-by: Rishabh Dave <ridave@redhat.com>
Kefu Chai [Fri, 18 Sep 2020 07:18:31 +0000 (15:18 +0800)]
Merge pull request #37158 from tchaikov/wip-no-more-assertDictContainsSubset
mgr/dashboard: replace assertDictContainsSubset() with assertLessEqual()
Reviewed-by: Volker Theile <vtheile@suse.com>
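A minimal sketch of the replacement named in this merge, assuming the usual idiom of comparing `dict.items()` views. `assertDictContainsSubset()` has been deprecated since Python 3.2; because items views are set-like, `<=` between them means "is a subset of". The test class and dictionary contents below are illustrative only and are not taken from the dashboard test suite.

```python
import unittest


class SubsetTest(unittest.TestCase):
    def test_subset(self):
        expected = {"name": "osd.0", "up": True}
        actual = {"name": "osd.0", "up": True, "weight": 1.0}
        # dict.items() views are set-like, so <= means "is a subset of".
        self.assertLessEqual(expected.items(), actual.items())


def run_subset_test():
    # Run the single test case programmatically and return the result object.
    suite = unittest.defaultTestLoader.loadTestsFromTestCase(SubsetTest)
    return unittest.TextTestRunner(verbosity=0).run(suite)
```

Note the argument order: the expected subset comes first, the full dictionary second.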
Kefu Chai [Fri, 18 Sep 2020 05:42:44 +0000 (13:42 +0800)]
Merge pull request #37170 from yaarith/add-smartctl-nvme-dependencies
ceph.spec, debian: add smartmontools, nvme-cli dependencies
Reviewed-by: Kefu Chai <kchai@redhat.com>
Kefu Chai [Fri, 18 Sep 2020 04:23:13 +0000 (12:23 +0800)]
cmake: introduce Boost::MPL interface library
Signed-off-by: Kefu Chai <kchai@redhat.com>
Kefu Chai [Fri, 18 Sep 2020 03:01:19 +0000 (11:01 +0800)]
src: Revert "Fix to raise BOOST_MPL_LIMIT_LIST_SIZE from 20 to 30"
This reverts commit 3f4e9a4526b8f174888828078e610769b80e48ec.
The FTBFS will be fixed by introducing an interface library in the CMake script.
Signed-off-by: Kefu Chai <kchai@redhat.com>
Kefu Chai [Fri, 18 Sep 2020 04:17:44 +0000 (12:17 +0800)]
cmake: extract admin/CMakeLists.txt
for better modularity
Signed-off-by: Kefu Chai <kchai@redhat.com>
Jason Dillaman [Fri, 4 Sep 2020 02:21:30 +0000 (22:21 -0400)]
librbd: pass IOContext to image-extent IO dispatch methods
This allows a specific IOContext to be used regardless of the image's
current read and write snapshot state.
Signed-off-by: Jason Dillaman <dillaman@redhat.com>
Jason Dillaman [Fri, 4 Sep 2020 00:22:30 +0000 (20:22 -0400)]
librbd: pass IOContext to object-extent IO dispatch methods
This allows a specific IOContext to be used regardless of the image's
current read and write snapshot state.
Signed-off-by: Jason Dillaman <dillaman@redhat.com>
Jason Dillaman [Thu, 3 Sep 2020 17:33:42 +0000 (13:33 -0400)]
librbd: helper method to create new data pool IOContext
Deep-copy will require the ability to issue IOs against arbitrary
IOContexts via the image-extent IO dispatcher.
Signed-off-by: Jason Dillaman <dillaman@redhat.com>
Jason Dillaman [Tue, 1 Sep 2020 15:17:41 +0000 (11:17 -0400)]
librbd: image dispatch spec tids are assigned by dispatcher
This was a legacy implementation where it was assigned by the ImageRequestWQ
and therefore needs to be part of the factory methods.
Signed-off-by: Jason Dillaman <dillaman@redhat.com>
Jason Dillaman [Mon, 31 Aug 2020 22:08:24 +0000 (18:08 -0400)]
librbd: simplify in-flight IO tracking for write-block image dispatch
Now that we don't need to worry about read requests issuing a finish
callback, we can use a simple counter to track in-flight writes.
Signed-off-by: Jason Dillaman <dillaman@redhat.com>
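The idea of tracking in-flight writes with a plain counter can be sketched as follows. This is an illustration in Python under stated assumptions, not librbd's C++ implementation: the class name `InFlightWrites` and its methods are hypothetical, and the sketch only shows how a flush can be deferred until the counter drops to zero.

```python
import threading


class InFlightWrites:
    """Counts outstanding writes and runs flush callbacks once none remain."""

    def __init__(self):
        self._lock = threading.Lock()
        self._count = 0
        self._waiters = []

    def start_write(self):
        with self._lock:
            self._count += 1

    def finish_write(self):
        with self._lock:
            self._count -= 1
            ready = self._waiters if self._count == 0 else []
            if ready:
                self._waiters = []
        # Invoke callbacks outside the lock to avoid re-entrancy problems.
        for cb in ready:
            cb()

    def flush(self, on_flushed):
        with self._lock:
            if self._count > 0:
                # Writes are still outstanding; defer the callback.
                self._waiters.append(on_flushed)
                return
        on_flushed()
```

A flush issued while writes are outstanding completes when the last write finishes; a flush issued with no writes in flight completes immediately.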
Jason Dillaman [Mon, 31 Aug 2020 22:07:14 +0000 (18:07 -0400)]
librbd: drop ImageDispatchInterface::handle_finished virtual method
Any dispatch layer can now directly place themselves in the finish
callback handler chain without the use of the generic callback.
Signed-off-by: Jason Dillaman <dillaman@redhat.com>
Jason Dillaman [Mon, 31 Aug 2020 22:27:32 +0000 (18:27 -0400)]
librbd: use an overridable finish handler for the image dispatcher
This mimics the design from the object dispatcher and will allow
for simplified in-flight IO tracking.
Signed-off-by: Jason Dillaman <dillaman@redhat.com>
Jason Dillaman [Thu, 17 Sep 2020 23:43:45 +0000 (19:43 -0400)]
librbd: drop flush tracker from exclusive lock image dispatch
We can now pass the flush through the exclusive-lock dispatch layer
to ensure all in-flight IOs have been processed.
Signed-off-by: Jason Dillaman <dillaman@redhat.com>
Jason Dillaman [Thu, 17 Sep 2020 17:05:27 +0000 (13:05 -0400)]
librbd: update refresh image dispatch layer flush exclusions
Only flush requests coming from the refresh state machine or from the
exclusive-lock dispatch layer initialization should be ignored. This is
because both can be initiated from the refresh state machine and
therefore deadlock.
Signed-off-by: Jason Dillaman <dillaman@redhat.com>
Jason Dillaman [Thu, 17 Sep 2020 20:22:20 +0000 (16:22 -0400)]
librbd: reorder exclusive-lock pre-release state steps
The exclusive-lock dispatch layer should be locked and flushed to
ensure no IO is waiting for a refresh. Once that is complete, interlock
with the refresh state machine and re-flush one last time w/ the
refresh dispatch layer skipped.
Signed-off-by: Jason Dillaman <dillaman@redhat.com>
Jason Dillaman [Thu, 17 Sep 2020 19:09:39 +0000 (15:09 -0400)]
librbd: avoid blocking writes when initializing exclusive-lock
The exclusive-lock dispatch layer will already block IOs as required
so this second layer of blocking just increases the complexity and
the potential for deadlocks when attempting to flush.
Signed-off-by: Jason Dillaman <dillaman@redhat.com>
Jason Dillaman [Thu, 17 Sep 2020 18:52:30 +0000 (14:52 -0400)]
librbd: skip flush from exclusive-lock dispatch layer on init/shutdown
If the exclusive-lock layer is being initialized/shut down at image
open/close, there is no IO flowing so there is no need to flush.
Signed-off-by: Jason Dillaman <dillaman@redhat.com>
Jason Dillaman [Thu, 17 Sep 2020 13:51:49 +0000 (09:51 -0400)]
librbd: assign a unique flush source to each internal component
This will allow improved tracking and bypassing of a flush request
that might cause IO deadlocks in dispatch layers.
Signed-off-by: Jason Dillaman <dillaman@redhat.com>
Yaarit Hatuka [Fri, 18 Sep 2020 03:25:56 +0000 (03:25 +0000)]
ceph.spec.in, debian/control: add smartmontools and nvme-cli dependencies
These packages are needed in order to scrape device health metrics from
devices used by OSD and MON daemons.
smartmontools' smartctl is what we use in order to scrape devices' SMART
attributes and general health metrics.
In addition, we use nvme-cli tool on NVMe devices, which fetches
vendor specific NVMe related health metrics.
Ceph relies on these tools for proper functioning of the underlying layers
of devicehealth mgr module, and other mgr modules which use devicehealth
functionality (such as diskprediction_local, telemetry, dashboard).
Essentially, most of devicehealth commands rely on proper functioning of
smartctl, otherwise they lack the device health metrics.
For example, in case smartctl is missing, the commands:
ceph device scrape-daemon-health-metrics <who>
ceph device scrape-health-metrics [<devid>]
will not be able to scrape health metrics, and the command:
ceph device predict-life-expectancy <devid>
will not provide any meaningful output (since there are no metrics).
In short, when we scrape a device by its daemon (be it an OSD or a MON):
ceph device scrape-daemon-health-metrics <who>
The devicehealth module command eventually invokes a
block_device_get_metrics() call in either osd/OSD.cc or mon/Monitor.cc,
which wraps calls to both
block_device_run_smartctl() (spawns smartctl)
block_device_run_vendor_nvme() (spawns nvme)
in common/blkdev.cc.
Minimum version requirements:
'smartmontools' is the package name, which contains two utility
programs: 'smartd' and 'smartctl'. Ceph uses the latter.
Version 6.7 of smartctl first introduced the --json option (beta), which
allows the metrics to be output in JSON format. Since then a few
adjustments were made and the feature officially launched in smartctl
version 7.0.
Since we rely on the JSON format to process the metrics, we must have
smartmontools' smartctl version >= 7.
That said, we choose not to specify smartmontools version here on
purpose, since there might be a scenario where:
We specified smartmontools version to be >= 7.
smartmontools 7 is not available yet in rhel 8 / centos 8.
A user installs via rpm ceph-osd, for example.
smartmontools will not be installed (since version >= 7 is not available
in this repo yet).
Then the user upgrades to 8.3 (which should have smartmontools >= 7),
but smartmontools will not get upgraded (since it's not installed).
In the scenario where we do not specify a version, smartmontools 6.6
will be installed, but it will be upgraded to >= 7 when a user upgrades
(and if it's a fresh installation - version >= 7 would be installed
anyway).
nvme-cli does not have a minimum version.
We use 'Recommends' for both rpm and deb packages since we do not want
the installation to fail in case of conflicts. 'Recommends' weakens the
dependency: it is installed when possible, but ignored in case of
conflicts with other dependencies.
It's worth mentioning that smartmontools and nvme-cli dependencies exist
in ceph-container builds.
We add them here for the cases of bare metal installations.
In the future we will add a separate package (with smartmontools and
nvme-cli dependencies) that can be installed on any node (running
rbd-mirror, rgw, mds, mgr, etc.), in order to be able to collect the
health metrics of its devices and offer their life expectancy
prediction.
Fixes: https://tracker.ceph.com/issues/47479
Signed-off-by: Yaarit Hatuka <yaarit@redhat.com>
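The minimum-version reasoning in this commit (JSON output requires smartctl >= 7) can be sketched as a small check. This is a hedged illustration, not code from the devicehealth module: the function name `smartctl_supports_json` is hypothetical, and the sketch assumes the version banner begins with `smartctl <major>.<minor>`, which should be verified against the installed tool.

```python
import re


def smartctl_supports_json(version_line, minimum=(7, 0)):
    """Return True if the reported smartctl version is at least `minimum`."""
    # Assumed banner shape: "smartctl 7.1 ..." (major.minor after the name).
    match = re.match(r"smartctl (\d+)\.(\d+)", version_line)
    if match is None:
        return False
    found = (int(match.group(1)), int(match.group(2)))
    # Tuples compare element by element, so (7, 1) >= (7, 0) holds.
    return found >= minimum
```

A caller could feed this the first line of the tool's version output and skip JSON scraping when it returns `False`.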
Wong Hoi Sing Edison [Tue, 25 Aug 2020 04:16:54 +0000 (12:16 +0800)]
systemd: Support Graceful Reboot for AIO Node
Ceph AIO installations on single or multiple nodes do not work well with
loopback mounts and, in particular, regularly deadlock during a graceful
system reboot.
We already have an `rbdmap.service` that is friendly to graceful system
reboots, as shown below:
[Unit]
After=network-online.target
Before=remote-fs-pre.target
Wants=network-online.target remote-fs-pre.target
[Service]
ExecStart=/usr/bin/rbdmap map
ExecReload=/usr/bin/rbdmap map
ExecStop=/usr/bin/rbdmap unmap-all
This PR introduces:
- `ceph-mon.target`: Ensure startup after `network-online.target` and
before `remote-fs-pre.target`
- `ceph-*.target`: Ensure startup after `ceph-mon.target` and before
`remote-fs-pre.target`
- `rbdmap.service`: Once all `_netdev` mounts have been unmounted by
`remote-fs.target`, ensure all RBD devices are unmapped BEFORE any Ceph
components under `ceph.target` are stopped during shutdown
The logic has been validated as a proof of concept in
<https://github.com/alvistack/ansible-role-ceph_common/tree/develop>;
also works as expected with Ceph + Kubernetes deployment by
<https://github.com/alvistack/ansible-collection-kubernetes/tree/develop>.
No more deadlocks occur during graceful system reboot, for both single-node
and multiple-node AIO setups with loopback mounts.
Also see:
- <https://github.com/ceph/ceph/pull/36776>
- <https://github.com/etcd-io/etcd/pull/12259>
- <https://github.com/cri-o/cri-o/pull/4128>
- <https://github.com/kubernetes/release/pull/1504>
Fixes: https://tracker.ceph.com/issues/47528
Signed-off-by: Wong Hoi Sing Edison <hswong3i@gmail.com>
Patrick Donnelly [Wed, 2 Sep 2020 23:51:50 +0000 (16:51 -0700)]
mds: convert stringstream to CachedStackStringStream
This is a simple performance refactor.
Signed-off-by: Patrick Donnelly <pdonnell@redhat.com>
Patrick Donnelly [Fri, 18 Sep 2020 01:30:11 +0000 (18:30 -0700)]
Merge PR #37163 into master
* refs/pull/37163/head:
mds: silence warning ‘MDSRank::fs_name’ will be initialized after [-Wreorder]
Reviewed-by: Patrick Donnelly <pdonnell@redhat.com>
Patrick Donnelly [Fri, 18 Sep 2020 01:28:13 +0000 (18:28 -0700)]
Merge PR #37147 into master
* refs/pull/37147/head:
mds/FSMap: check parse_role return before filtering
Reviewed-by: Ramana Raja <rraja@redhat.com>
Reviewed-by: Neha Ojha <nojha@redhat.com>
Reviewed-by: Rishabh Dave <ridave@redhat.com>
David Zafman [Fri, 18 Sep 2020 00:45:42 +0000 (17:45 -0700)]
Merge pull request #36989 from AmnonHanuhov/wip-ObjectStore_EIO_Handling
osd: Got rid of global flag eio_errors_to_process
Reviewed-by: David Zafman <dzafman@redhat.com>
Jason Dillaman [Mon, 31 Aug 2020 20:04:34 +0000 (16:04 -0400)]
librbd: remove unnecessary templating from io::ImageDispatchSpec
This was a remnant of the original implementation for the image
dispatch spec. Now it more closely aligns with the object dispatch
spec.
Signed-off-by: Jason Dillaman <dillaman@redhat.com>
Jason Dillaman [Mon, 31 Aug 2020 22:25:20 +0000 (18:25 -0400)]
librbd: queued IOs should retry acquiring exclusive lock
If the IO that attempts to acquire the exclusive lock fails,
any queued IO will not be retried leading to a deadlock.
Signed-off-by: Jason Dillaman <dillaman@redhat.com>
David Zafman [Fri, 18 Sep 2020 00:30:43 +0000 (17:30 -0700)]
Merge pull request #36397 from dzafman/wip-39012
distinguish unfound + impossible to find, vs start some down OSDs to get
Reviewed-by: Neha Ojha <nojha@redhat.com>
Reviewed-by: Josh Durgin <jdurgin@redhat.com>
Jason Dillaman [Thu, 17 Sep 2020 23:04:29 +0000 (19:04 -0400)]
Merge pull request #37132 from lixiaoy1/dirty_cache_feature
librbd: add DIRTY_CACHE in IMPLICIT_ENABLE
Reviewed-by: Jason Dillaman <dillaman@redhat.com>
Jason Dillaman [Thu, 17 Sep 2020 21:52:30 +0000 (17:52 -0400)]
Merge pull request #36586 from MahatiC/wip-ssd-integration
librbd/cache: SSD cache integration framework
Reviewed-by: Jason Dillaman <dillaman@redhat.com>
Kefu Chai [Thu, 17 Sep 2020 16:32:22 +0000 (00:32 +0800)]
Merge pull request #37141 from sebastian-philipp/cephadm-fix-rm-util.load_from_store
mgr/cephadm: fix RemoveUtil.load_from_store()
Reviewed-by: Joshua Schmid <jschmid@suse.de>
Patrick Donnelly [Thu, 17 Sep 2020 16:01:33 +0000 (09:01 -0700)]
qa: spawn MDS daemons before creating fs
This avoids unnecessary MDS_ALL_DOWN messages because the MDS daemons
have not yet been spawned.
Fixes: https://tracker.ceph.com/issues/47518
Signed-off-by: Patrick Donnelly <pdonnell@redhat.com>
Kefu Chai [Fri, 11 Sep 2020 23:04:14 +0000 (07:04 +0800)]
doc/cephadm: use appropriate directive for formatting codeblocks
Signed-off-by: Kefu Chai <kchai@redhat.com>
Ulrich Weigand [Thu, 17 Sep 2020 13:52:54 +0000 (15:52 +0200)]
test/librados: fix endian bugs in checksum test cases
We're seeing test failures when running rados/test.sh in Teuthology
on a big-endian platform (IBM Z). These are all related to calls
to the checksum operations, which expect little-endian inputs and
outputs, but are in many places called with native-endian types
from the test code.
One test case, LibRadosAio::RoundTrip3 in aio.cc, already uses
ceph_le types to address this problem, and this test actually
completes successfully on IBM Z. This patch changes the other
test case performing checksum operations accordingly.
With this patch in place, rados/test.sh now completes successfully.
Fixes: https://tracker.ceph.com/issues/47516
Signed-off-by: Ulrich Weigand <ulrich.weigand@de.ibm.com>
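The endianness problem described in this commit can be illustrated with Python's `struct` module. This is only an analogy for the C++ test code: the value below is arbitrary, and the sketch shows what happens when bytes written in little-endian order are interpreted with the opposite byte order, as a native-endian type would do on a big-endian host.

```python
import struct

value = 0x12345678

little = struct.pack("<I", value)   # explicit little-endian layout
big = struct.pack(">I", value)      # explicit big-endian layout

# Decoding little-endian bytes as big-endian yields a byte-swapped value,
# which is the kind of mismatch a native-endian type causes on a
# big-endian platform when the operation expects little-endian data.
misread = struct.unpack(">I", little)[0]
```

Using an explicitly little-endian type on both sides, as `ceph_le` types do in the C++ tests, removes the dependence on the host byte order.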
Patrick Donnelly [Thu, 17 Sep 2020 13:42:33 +0000 (06:42 -0700)]
Merge PR #37197 into master
* refs/pull/37197/head:
doc: add "fs authorize" subcommand to ceph man page
Reviewed-by: Patrick Donnelly <pdonnell@redhat.com>
Kefu Chai [Thu, 17 Sep 2020 10:40:06 +0000 (18:40 +0800)]
Merge pull request #37120 from tchaikov/wip-rados-type-hintings
pybind/rados: add more type hintings
Reviewed-by: Josh Durgin <jdurgin@redhat.com>
Varsha Rao [Thu, 17 Sep 2020 10:39:52 +0000 (16:09 +0530)]
mgr/volumes/nfs: Check if orchestrator spec service_id is valid
Fixes: https://tracker.ceph.com/issues/47512
Signed-off-by: Varsha Rao <varao@redhat.com>
Kefu Chai [Thu, 17 Sep 2020 10:39:09 +0000 (18:39 +0800)]
Merge pull request #37143 from dvanders/dvanders_flush
ceph.in: ignore failures to flush stdout
Reviewed-by: Patrick Donnelly <pdonnell@redhat.com>
Reviewed-by: Kefu Chai <kchai@redhat.com>
Kefu Chai [Thu, 17 Sep 2020 10:38:15 +0000 (18:38 +0800)]
Merge pull request #37100 from rhcs-dashboard/fix-47400-master
ceph: ignore BrokenPipeError when printing help
Reviewed-by: Kefu Chai <kchai@redhat.com>
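A minimal sketch of the pattern behind this merge and the preceding one, assuming the goal is to tolerate a closed pipe while writing or flushing output. The names `print_help` and `ClosedPipe` are hypothetical and do not come from `ceph.in`; `ClosedPipe` is only a test double that simulates a reader that has already exited.

```python
import io


def print_help(text, stream):
    """Write help text, tolerating a reader that has already gone away."""
    try:
        stream.write(text)
        stream.flush()
        return True
    except BrokenPipeError:
        # The consumer closed the pipe early; nothing more to report.
        return False


class ClosedPipe(io.StringIO):
    """Test double that behaves like a pipe whose reader has exited."""

    def write(self, s):
        raise BrokenPipeError()
```

Returning a boolean instead of re-raising lets the caller exit quietly when the output was truncated by the consumer.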
Kefu Chai [Thu, 17 Sep 2020 10:16:06 +0000 (18:16 +0800)]
Merge pull request #37045 from tchaikov/wip-crimson-bt
common/BackTrace: extract demangle() out
Reviewed-by: Radoslaw Zarzynski <rzarzyns@redhat.com>
Volker Theile [Wed, 16 Sep 2020 10:06:36 +0000 (12:06 +0200)]
cephadm: Introduce 'container' specification to deploy custom containers
Fixes: https://tracker.ceph.com/issues/46666
Signed-off-by: Volker Theile <vtheile@suse.com>
Varsha Rao [Thu, 17 Sep 2020 06:59:40 +0000 (12:29 +0530)]
mgr/rook: Pass pod namespace to list_namespaced_pod()
The list_namespaced_pod() method requires the pod namespace rather than the cluster name.
Fixes: https://tracker.ceph.com/issues/47511
Signed-off-by: Varsha Rao <varao@redhat.com>
Kefu Chai [Wed, 9 Sep 2020 00:42:24 +0000 (08:42 +0800)]
common/BackTrace: let abi::__cxa_demangle() do the malloc
also use the returned length for constructing the string_view to be
appended.
we could reuse the buffer across multiple demangle() calls to save the
calls to malloc()/free(), but the upside of this change is that it's
simpler.
Signed-off-by: Kefu Chai <kchai@redhat.com>