Sage Weil [Tue, 10 Aug 2021 20:47:34 +0000 (16:47 -0400)]
Merge PR #42318 into master
* refs/pull/42318/head:
mgr/rook: update DefaultFetcher device path to look at local and fix bug
mgr/rook: add node and PV name information to Device in DefaultFetcher
mgr/rook: fix typing errors in Fetcher classes
mgr/rook: create and use DefaultFetcher and LSOFetcher classes
mgr/rook: create KubernetesCustomResource class to fetch CRs
mgr/rook: fix device ls error handling
mgr/rook: change storage class module option name and default value
mgr/rook: fix typing errors related to storage_class_name and device ls
mgr/rook: make `device ls` only display pvs in specified storage class
mgr/rook: add StorageV1Api and storage_class_name to RookCluster
mgr/rook: add StorageV1Api to RookOrchestrator
mgr/rook: add mgr/rook/storage_class_name to ceph config
mgr/rook: ceph orch device ls fetch and display info about PVs
mgr/rook: add CustomObjectsApi to RookCluster
Reviewed-by: Juan Miguel Olmo <jolmomar@redhat.com>
Sage Weil [Tue, 10 Aug 2021 20:37:38 +0000 (16:37 -0400)]
Merge PR #42691 into master
* refs/pull/42691/head:
mgr/nfs: add --port to 'nfs cluster create' and port to 'nfs cluster info'
qa/suites/orch/cephadm/smoke-roleless: test taking ganeshas offline
qa/tasks/vip: exec with bash -ex
qa/suites/orch/cephadm: separate test_nfs from test_orch_cli
Sage Weil [Tue, 10 Aug 2021 14:36:37 +0000 (10:36 -0400)]
Merge PR #42680 into master
* refs/pull/42680/head:
src/pybind/mgr/nfs/tests: pass cluster_id to from_export_block()
src/pybind/mgr/nfs: remove `tag` option
src/pybind/mgr/nfs: remove per daemon config test
src/pybind/mgr/nfs: directly use cluster_id and remove daemon related stuff
script: run-cbt.sh tests crimson with CyanStore instead of MemStore.
These tests were always supposed to run against CyanStore. However,
commit e6ed65db8b4e0a2f8026c2e35a12dd292c5f2b8c (PR #42437) changed
the meaning of `--memstore` and introduced `--cyanstore` to be used
instead. This commit makes `run-cbt.sh` aware of the new switch.
This PR updates the text in the RADOS Guide
(the Ceph Storage Cluster Guide) that appears
at the beginning of the "Storage Devices"
chapter. I did the following:
- rewrote some of the sentences so that
they read more like written text than like
spoken language
- added "Ceph Manager" to the list of daemons
that a Ceph cluster comprises
- that's about it.
Sage Weil [Mon, 9 Aug 2021 18:15:28 +0000 (14:15 -0400)]
cephadm: fix container name detection
'enter' was broken because we weren't correctly identifying the container
name. Strip the newline from the inspect result so that we can reliably
match against the 'running' state.
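A minimal sketch of the idea described above, not the actual cephadm code: the helper names and the `inspect` callable are hypothetical, and the only behavior taken from the commit message is that the inspect result carries a trailing newline that must be stripped before comparing against 'running'.

```python
from typing import Callable, Iterable, Optional


def is_running(inspect_output: str) -> bool:
    """Return True if the inspect output reports the 'running' state.

    Command output usually ends with a newline, so a direct comparison
    against 'running' fails unless the output is stripped first.
    """
    return inspect_output.strip() == 'running'


def find_running_container(
    candidates: Iterable[str],
    inspect: Callable[[str], Optional[str]],
) -> Optional[str]:
    """Return the first candidate name whose state is 'running'.

    `inspect` is a hypothetical callable that returns the raw state
    string for a container name, or None if the container is unknown.
    """
    for name in candidates:
        out = inspect(name)
        if out is not None and is_running(out):
            return name
    return None
```

With this sketch, an unstripped comparison such as `'running\n' == 'running'` is False, which is the kind of mismatch the commit describes.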
Neha Ojha [Mon, 9 Aug 2021 14:35:01 +0000 (14:35 +0000)]
qa/suites/rados/perf/ceph.yaml: remove rgw
The rgw task is no longer required because we removed cosbench workloads in fd350fd0150a2d4072f055658c20314a435a19ba. Removing it also prevents
failures like the following, and failures from any other change that breaks the rgw task:
```
2021-08-06T20:13:25.812 INFO:teuthology.orchestra.run.smithi060.stderr:curl: (7) Failed to connect to smithi060.front.sepia.ceph.com port 80: Connection refused
2021-08-06T20:15:33.813 ERROR:teuthology.contextutil:Saw exception from nested tasks
Traceback (most recent call last):
File "/home/teuthworker/src/git.ceph.com_git_teuthology_04c2febe7099917d97a71271f17abb5710030132/teuthology/contextutil.py", line 31, in nested
vars.append(enter())
File "/usr/lib/python3.6/contextlib.py", line 81, in __enter__
return next(self.gen)
File "/home/teuthworker/src/github.com_ceph_ceph-c_3c0f8c8164075af7aac4d1f2805d3f4580709461/qa/tasks/rgw.py", line 191, in start_rgw
wait_for_radosgw(url, remote)
File "/home/teuthworker/src/github.com_ceph_ceph-c_3c0f8c8164075af7aac4d1f2805d3f4580709461/qa/tasks/util/rgw.py", line 94, in wait_for_radosgw
assert exit_status == 0
AssertionError
```
Kefu Chai [Sun, 8 Aug 2021 17:21:38 +0000 (01:21 +0800)]
crimson/common: instantiate interrupt_cond in .cc
so we can explicitly instantiate it.
this should address the segfault when accessing interrupt_cond when
it is defined as a plain thread local storage template variable in the
header file.
it seems Clang is not able to identify the access to TLS variable and
the value of %fs segment register of the main thread is always zero if
interrupt_cond is defined as a plain global variable stored in
thread local storage.
librbd/cache/pwl/rwl: fix buf_persist and add writeback_lat perf counters
initialize buf_persist_time, then change name buf_persist_time to
buf_persist_start_time, change flush to internal_flush. add
writeback_lat perf counters. update some print formats for perf.
Casey Bodley [Fri, 6 Aug 2021 19:14:26 +0000 (15:14 -0400)]
radosgw-admin: 'sync status' is not behind if there are no mdlog entries
if remote mdlogs are trimmed prematurely, sync status will report
that it's behind the remote's max-marker even if there are no mdlog
entries to sync
for each behind shard, we fetch the next mdlog entry from the remote. if
we get an empty listing, remove that shard from behind_shards. this
logic now has to run before we print "behind shards:" so that empty
shards aren't listed
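The filtering step described above can be sketched as follows. This is an illustration in Python, not the radosgw-admin implementation; `prune_empty_behind_shards` and `fetch_next_entries` are hypothetical names, and the marker type is assumed to be a string.

```python
from typing import Callable, Dict, List


def prune_empty_behind_shards(
    behind_shards: Dict[int, str],
    fetch_next_entries: Callable[[int, str], List[dict]],
) -> Dict[int, str]:
    """Keep only the shards that still have mdlog entries to sync.

    behind_shards maps a shard id to the local sync marker for that shard.
    fetch_next_entries is a hypothetical callable that asks the remote for
    the next mdlog entries after the given marker.
    """
    still_behind = {}
    for shard_id, marker in behind_shards.items():
        entries = fetch_next_entries(shard_id, marker)
        # An empty listing means there is nothing left to sync for this
        # shard, so it is not reported as behind.
        if entries:
            still_behind[shard_id] = marker
    return still_behind
```

Running the pruning before printing the "behind shards:" line is what keeps empty shards out of the output.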
librbd/cache/pwl: avoid stack overflow caused by nested shared_ptr destruction
Destruction of nested shared_ptr will cause stack overflow.
With the explicit assignment of nullptr, the deleted node
is completely disconnected from the current linked list
Kefu Chai [Fri, 6 Aug 2021 11:48:19 +0000 (19:48 +0800)]
test/crimson: mark final class "final"
silences warning from Clang like:
../src/test/crimson/seastore/test_object_data_handler.cc:33:16: warning: class with destructor marked 'final' cannot be inherited from [-Wfinal-dtor-non-final-class]
~TestOnode() final = default;
^
../src/test/crimson/seastore/test_object_data_handler.cc:20:7: note: mark 'TestOnode' as 'final' to silence this warning
class TestOnode : public Onode {
^
1 warning generated.
Kefu Chai [Fri, 6 Aug 2021 09:26:16 +0000 (17:26 +0800)]
cmake: fail on unknown attribute
on Clang, the option for detecting unknown attribute is
-Wunknown-attributes, so "-Wattributes -Werror" does not fail the test
when the C compiler is Clang.
in this change, we just turn all warnings into errors.
this should fail the test if the compiler does not understand
`__attribute__((__symver__ ...))`
See Rook issue https://github.com/rook/rook/issues/7940 for full
information.
Ceph bluestore disks can sometimes appear as though they have "phantom"
Atari (AHDI) partitions created on them when they don't in reality. This
is due to a series of bugs in the Linux kernel when it is built with
Atari support enabled. This behavior does not appear for raw mode OSDs on
partitions, only on disks.
Changing the on-disk format of BlueStore OSDs comes with
backwards-compatibility challenges, and a fix in the kernel could take
years to reach users. Working around the kernel issue in ceph-volume is
therefore the best way to fix the issue for Ceph.
To work around the issue in ceph-volume, two behaviors need to be
adjusted:
1. `ceph-volume inventory` should not report that a partition is
available if the parent device is a BlueStore OSD.
2. `ceph-volume raw list` should report parent disks if the disk is a
BlueStore OSD and not report the disk's children, BUT it should still
report children if the parent disk is not a BlueStore OSD.
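The two rules above can be sketched as follows. This is a simplified model, not the ceph-volume code: the `Device` class, its fields, and both function names are hypothetical, and the sketch assumes that a child is only reported when it is itself a BlueStore OSD.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class Device:
    """Hypothetical, simplified view of a disk or partition."""
    path: str
    is_bluestore: bool = False
    children: List["Device"] = field(default_factory=list)


def partition_is_available(parent: Device, otherwise_available: bool = True) -> bool:
    """Rule 1: a partition is never available if its parent disk is a
    BlueStore OSD, since the partition may be a phantom one."""
    if parent.is_bluestore:
        return False
    return otherwise_available


def devices_to_report(disks: List[Device]) -> List[str]:
    """Rule 2: report a BlueStore parent disk and skip its children;
    for a non-BlueStore parent, still report its BlueStore children."""
    report = []
    for disk in disks:
        if disk.is_bluestore:
            report.append(disk.path)
        else:
            report.extend(c.path for c in disk.children if c.is_bluestore)
    return report
```

For example, a BlueStore disk with a phantom partition yields only the disk itself, while a partitioned non-BlueStore disk yields whichever of its partitions are BlueStore OSDs.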
=============================DEBUG ASSISTANCE==========================
If you are seeing an error here please try the following to
successfully install cryptography:
Upgrade to the latest pip and try again. This will fix errors for most
users. See: https://pip.pypa.io/en/stable/installing/#upgrading-pip
=============================DEBUG ASSISTANCE==========================
Traceback (most recent call last):
File "<string>", line 1, in <module>
File "/tmp/pip-build-7fhnk5us/cryptography/setup.py", line 14, in <module>
from setuptools_rust import RustExtension
ModuleNotFoundError: No module named 'setuptools_rust'