John Mulligan [Tue, 2 Aug 2022 13:45:59 +0000 (09:45 -0400)]
mgr/volumes: drop pre-python 3.2 version checks
Based on other conversations we believe that there is no need to support
python versions lower than Python 3.6 for pacific and later. This means
it is safe to drop the remaining version checks for python
3.2.
Signed-off-by: John Mulligan <jmulligan@redhat.com>
John Mulligan [Mon, 11 Jul 2022 20:44:00 +0000 (16:44 -0400)]
mgr/volumes: a lock to guard against races reading/writing config
Fixes: https://tracker.ceph.com/issues/55583
Use a python threading lock to avoid race conditions where the
config file is being both read and written to at the same time.
Before this change, the content of the config file being parsed could be
'corrupted' by the MetadataManager racing with itself. Along with the
previous two patches, additional logging was added to the mgr code to
produce the simplified version of the mgr log below:
```
[volumes INFO volumes.fs.operations.versions.metadata_manager] READ: b'[GLOBAL]\nversion = 2\ntype = clone\npath = /volumes/Park/babydino2/c9f773af-5221-49c6-846c-d65c0920ae3f\nstate = pending\n\n[source]\nvolume = cephfs\ngroup = Park\nsubvolume = Jurrasic\nsnapshot = dinodna0\n\n'
[volumes INFO volumes.fs.operations.versions.metadata_manager] READ: b''
[volumes INFO volumes.fs.operations.versions.metadata_manager] READ: b'[GLOBAL]\nversion = 2\ntype = clone\npath = /volumes/Park/babydino2/c9f773af-5221-49c6-846c-d65c0920ae3f\nstate = pending\n\n[source]\nvolume = cephfs\ngroup = Park\nsubvolume = Jurrasic\nsnapshot = dinodna0\n\n'
[volumes INFO volumes.fs.operations.versions.metadata_manager] wrote 203 bytes to config b'/volumes/Park/babydino2/.meta'
[volumes INFO volumes.fs.operations.versions.metadata_manager] READ: b'a0\n\n'
[volumes INFO volumes.fs.operations.versions.metadata_manager] READ: b''
[volumes ERROR volumes.module] Failed _cmd_fs_clone_cancel(clone_name:babydino2, format:json, group_name:Park, prefix:fs clone cancel, vol_name:cephfs) < "":
Traceback (most recent call last):
...
File "/usr/lib64/python3.6/configparser.py", line 1111, in _read
raise e
configparser.ParsingError: Source contains parsing errors: b'/volumes/Park/babydino2/.meta'
[line 13]: 'a0\n'
```
Looking at the above you can see that the log indicates a write to the
config file (of 203 bytes). This happens before the read of the file has
finished, and thus instead of getting an empty string indicating EOF, the
reader gets the last four bytes of the new content of the file. The lock
prevents the MetadataManager from both reading and writing the config
file at the same time.
Signed-off-by: John Mulligan <jmulligan@redhat.com>
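A minimal sketch of the locking pattern described in this commit. The class name `LockedMetadata`, its methods, and the in-memory string standing in for the `.meta` file on CephFS are all hypothetical; this is not the actual MetadataManager code.

```python
import configparser
import io
import threading


class LockedMetadata:
    """Hypothetical stand-in for MetadataManager (not the real Ceph class)."""

    def __init__(self):
        self._lock = threading.Lock()  # shared by readers and writers
        self._stored = ""              # stands in for the .meta file content
        self.config = configparser.ConfigParser()

    def refresh(self):
        # Re-parse the stored content; no writer can run while we hold the lock.
        with self._lock:
            parser = configparser.ConfigParser()
            parser.read_string(self._stored)
            self.config = parser

    def update(self, section, key, value):
        # Modify the config and write it back under the same lock, so a
        # concurrent refresh never sees a partially written file.
        with self._lock:
            if not self.config.has_section(section):
                self.config.add_section(section)
            self.config.set(section, key, value)
            buf = io.StringIO()
            self.config.write(buf)
            self._stored = buf.getvalue()
```

Because both methods take the same lock, a read can only observe the content from before or after a complete write, never a mix of the two.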
John Mulligan [Tue, 12 Jul 2022 22:33:07 +0000 (18:33 -0400)]
mgr/volumes: write volume metadata with shim class
Add a class that works a bit like a python file object so that we
can simplify the flush function. Providing a file-like object to
the ConfigParser's write function avoids unnecessary copies to
a StringIO object and makes the code easier to read.
With no more uses of StringIO, the StringIO imports are removed.
Signed-off-by: John Mulligan <jmulligan@redhat.com>
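A rough sketch of the shim idea, assuming only that `ConfigParser.write()` needs an object with a `write(str)` method. The class name `ChunkWriter` and the `sink` callable are hypothetical; in the real code the sink would be a write to the file on CephFS.

```python
import configparser


class ChunkWriter:
    """Hypothetical file-like shim: ConfigParser.write() only calls write()."""

    def __init__(self, sink):
        self._sink = sink  # callable taking bytes, returning bytes written
        self.total = 0     # running count of bytes handed to the sink

    def write(self, text):
        data = text.encode("utf-8")
        self.total += self._sink(data)
        return len(text)


# Usage: collect the serialized config in a bytearray instead of a StringIO.
stored = bytearray()


def sink(data):
    stored.extend(data)
    return len(data)


config = configparser.ConfigParser()
config["GLOBAL"] = {"version": "2", "state": "pending"}
writer = ChunkWriter(sink)
config.write(writer)
```

The parser calls `write()` once per line of output, so the shim can forward each piece directly without building an intermediate copy.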
John Mulligan [Tue, 12 Jul 2022 22:32:54 +0000 (18:32 -0400)]
mgr/volumes: read volume metadata file using read_string
The read_string method, available since Python 3.2 (we assume Python 3.6 as
our current minimum python version), supports parsing a provided string
for ini-style configuration parameters. Refactoring the reading of the
config file from cephfs into a simple iterator function, and then
providing its output to the ConfigParser as a single string, allows us to
avoid using StringIO and simplifies the refresh function.
Signed-off-by: John Mulligan <jmulligan@redhat.com>
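The approach in this commit could look roughly like the following. The helper name `read_chunks` is hypothetical, and an `io.BytesIO` object stands in for the CephFS file handle; only `ConfigParser.read_string` is taken from the commit message.

```python
import configparser
import io


def read_chunks(read_fn, chunk_size=4096):
    """Yield raw chunks until read_fn returns empty bytes (EOF)."""
    while True:
        data = read_fn(chunk_size)
        if not data:
            break
        yield data


# io.BytesIO stands in for the file on CephFS (an assumption of this sketch).
raw = io.BytesIO(
    b"[GLOBAL]\nversion = 2\ntype = clone\nstate = pending\n\n"
    b"[source]\nvolume = cephfs\n"
)

# Join the chunks first, then decode once, so a multi-byte character that
# straddles a chunk boundary is not split.
text = b"".join(read_chunks(raw.read, chunk_size=16)).decode("utf-8")

parser = configparser.ConfigParser()
parser.read_string(text)
```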
test/common: disable tests for commutativity of operator==()
older C++ compilers like GCC-9 do not rewrite operator==() so that
`a == b` implies `b == a`, in other words, so that
operator==(const LHS& lhs, const RHS& rhs) is equivalent to
operator==(const RHS& rhs, const LHS& lhs). see
section 1.2 in https://www.open-std.org/jtc1/sc22/wg21/docs/papers/2017/p0515r3.pdf
gcc-toolset-11-annobin is already installed, but ceph.spec.in adds
"-specs=/usr/lib/rpm/redhat/redhat-annobin-cc1" which needs the gcc
plugin too
resolves this failure during the cmake configure step:
-- Check for working CXX compiler: /opt/rh/gcc-toolset-11/root/usr/bin/c++
-- Check for working CXX compiler: /opt/rh/gcc-toolset-11/root/usr/bin/c++ - broken
CMake Error at /usr/share/cmake/Modules/CMakeTestCXXCompiler.cmake:59 (message):
The C++ compiler
Kefu Chai [Mon, 28 Feb 2022 15:12:22 +0000 (23:12 +0800)]
crimson/common: add alternative overload for FixedKVNodeLayout<>::iter_t
otherwise we'd have the following FTBFS with C++20 and clang13:
In file included from /var/ssd/ceph/src/crimson/os/seastore/lba_manager.cc:7:
In file included from /var/ssd/ceph/src/crimson/os/seastore/lba_manager/btree/btree_lba_manager.h:24:
/var/ssd/ceph/src/crimson/os/seastore/lba_manager/btree/lba_btree.h:474:26: error: use of overloaded operator '==' is ambiguous (with operand types 'crimson::common::FixedKVNodeLayout<254, crimson::os::seastore::l$
auto end = next_iter == parent->end()
~~~~~~~~~ ^ ~~~~~~~~~~~~~
/var/ssd/ceph/src/crimson/common/fixed_kv_node_layout.h:127:10: note: candidate function
bool operator==(const iter_t &rhs) const {
^
/var/ssd/ceph/src/crimson/common/fixed_kv_node_layout.h:127:10: note: candidate function (with reversed parameter order)
Adam King [Tue, 26 Jul 2022 13:55:05 +0000 (09:55 -0400)]
mgr/cephadm: clear error message when resuming upgrade
the message field in the output of "ceph orch upgrade status"
will first take the value of the error field of the UpgradeState,
and only if it is blank/None, display an info string we periodically
update throughout the upgrade with useful info such as that
we're upgrading a daemon of a particular type or pulling an image
on a certain host. When an upgrade fails, we set the error field
of the UpgradeState, pause the upgrade and raise a health warning.
Sometimes, the user is able to resolve the issue and simply resume
the upgrade. The issue here is, in that case, the error field of
the UpgradeState is still set, so instead of seeing the useful info
messages, it will continue to display an error message that may
no longer be relevant. By emptying the error field of the UpgradeState
when upgrades are resumed, we return to normal behavior of
displaying the info string, and will only show another error message
if another error actually occurs.
Fixes: https://tracker.ceph.com/issues/56714
Signed-off-by: Adam King <adking@redhat.com>
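The behavior described in this commit can be illustrated with a small sketch. `UpgradeStateSketch` and its field and method names are hypothetical and do not mirror cephadm's actual UpgradeState class; only the rule "show the error if set, otherwise the info string" comes from the message above.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class UpgradeStateSketch:
    """Hypothetical model of the status fields discussed above."""

    progress_info: str = ""
    error: Optional[str] = None
    paused: bool = False

    def status_message(self):
        # The error takes priority; the info string is shown only when
        # the error is blank/None.
        return self.error if self.error else self.progress_info

    def fail(self, reason):
        self.error = reason
        self.paused = True

    def resume(self):
        # Clearing the error on resume lets the info string show again.
        self.paused = False
        self.error = None
```

Without the reset in `resume()`, `status_message()` would keep returning the stale error even after the upgrade continued.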
mgr/volumes: Fix subvolume creation in FIPS enabled system.
The md5 checksum is used in the construction of the legacy
subvolume config filename. It's not used for security purposes.
Hence set the 'usedforsecurity' flag to false to
make it FIPS compliant.
The usage of md5 was always there. The commit 373a04cf734
caused it to be exercised in 'open_subvol', which is a prerequisite
for all the subvolume operations, and hence subvolume
creation failed.
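A minimal sketch of the flag in use. The helper name `legacy_config_name` and the `.meta` suffix are illustrative assumptions, not the actual mgr/volumes code. The `usedforsecurity` keyword is part of `hashlib` since Python 3.9 (some distributions backport it to older interpreters); the fallback below is only there so the sketch runs on interpreters without the keyword, and it would not help on a FIPS-enabled system.

```python
import hashlib


def legacy_config_name(subvol_path: bytes) -> str:
    """Hypothetical helper: derive a filename from the md5 of a path."""
    try:
        # Declare that the digest is not used for security purposes.
        digest = hashlib.md5(subvol_path, usedforsecurity=False)
    except TypeError:
        # Interpreter without the keyword (assumption: non-FIPS system).
        digest = hashlib.md5(subvol_path)
    return digest.hexdigest() + ".meta"
```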
When the class `Device` is instantiated with a path instead of a
block device, it fails as follows:
```
Traceback (most recent call last):
File "<stdin>", line 1, in <module>
File "/usr/lib/python3.6/site-packages/ceph_volume/util/device.py", line 130, in __init__
self._parse()
File "/usr/lib/python3.6/site-packages/ceph_volume/util/device.py", line 233, in _parse
self.ceph_device = disk.has_bluestore_label(self.path)
File "/usr/lib/python3.6/site-packages/ceph_volume/util/disk.py", line 906, in has_bluestore_label
with open(device_path, "rb") as fd:
IsADirectoryError: [Errno 21] Is a directory: '/var/lib/ceph/osd/ceph-0/'
```
passing a path instead of a block device is valid, `simple scan` needs it.
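The actual fix is not shown in this message. One possible guard, as a sketch only: treat a directory as "no label" before trying to open it. The function name `has_bluestore_label_sketch` is hypothetical, and the signature bytes are an assumption about how a BlueStore label begins.

```python
import os

# Assumed prefix of a BlueStore device label (an assumption of this sketch).
BLUESTORE_SIGNATURE = b"bluestore block device"


def has_bluestore_label_sketch(device_path):
    """Return True if the file or device starts with the assumed signature."""
    if os.path.isdir(device_path):
        # A directory such as /var/lib/ceph/osd/ceph-0/ cannot carry a label,
        # and open(..., "rb") on it would raise IsADirectoryError.
        return False
    with open(device_path, "rb") as fd:
        return fd.read(len(BLUESTORE_SIGNATURE)) == BLUESTORE_SIGNATURE
```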
ceph-volume unit tests shouldn't actually create content on the
filesystem they run from (even when it is written to a tmp
dir); let's use pyfakefs instead.
when the --gpg-url option is specified, the name used for the gpg filename is missing, which throws an exception.
this adds the string "manual" to the gpg key filename: /etc/apt/trusted.gpg.d/ceph.manual.gpg
tools/ceph-dencoder: register dencoders in "lib" in dev env
if "CMakeCache.txt" is found in the current directory, try to load
dencoder shared libraries from ./lib. this heuristic is also used by
`ceph.in` when relaunching itself to get access to python
bindings.
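ceph-dencoder itself is C++, so the following is only a Python sketch of the heuristic described above. The function name `plugin_dir` and the `installed_dir` parameter are hypothetical.

```python
import os
import tempfile


def plugin_dir(cwd, installed_dir):
    """Pick ./lib under cwd when cwd looks like a CMake build directory."""
    if os.path.exists(os.path.join(cwd, "CMakeCache.txt")):
        return os.path.join(cwd, "lib")
    return installed_dir


# Usage: a temporary directory plays the role of a build directory.
with tempfile.TemporaryDirectory() as build:
    before = plugin_dir(build, "/installed/plugins")
    open(os.path.join(build, "CMakeCache.txt"), "w").close()
    after = plugin_dir(build, "/installed/plugins")
    expected_after = os.path.join(build, "lib")
```

Here `before` falls back to the installed location, while `after` points at `lib` inside the build directory once `CMakeCache.txt` exists.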