Joseph Sawaya [Fri, 3 Sep 2021 14:28:57 +0000 (10:28 -0400)]
mgr/rook: add unit tests to Rook module
This commit creates a unit test folder for rook, configures tox,
and adds a unit test covering the translation of placement specs
to node selectors and vice versa.
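As a self-contained sketch of what such a test could look like (the
translation helper below is a stand-in; the real module's names may differ):
```
from ceph.deployment.service_spec import PlacementSpec

def placement_spec_to_node_selector(spec):
    # stand-in for the rook module's translation logic
    if spec.label:
        return {'ceph-label/' + spec.label: ''}
    return {}

def test_label_translates_to_node_selector():
    spec = PlacementSpec(label='storage')
    assert placement_spec_to_node_selector(spec) == {'ceph-label/storage': ''}
```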
Joseph Sawaya [Fri, 3 Sep 2021 14:24:24 +0000 (10:24 -0400)]
mgr/rook: translate placement spec to node selector and vice versa
This commit creates methods for translating PlacementSpecs to NodeSelectors
and vice versa. This is a general mechanism intended for use by orch commands
that have a placement option. It can translate specs that specify a label,
hosts, or the host_pattern '*'.
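A hedged sketch of what this translation could look like, assuming node
selectors are plain {key: value} dicts and that host placements map to the
well-known kubernetes.io/hostname label; the function names are illustrative:
```
from ceph.deployment.service_spec import PlacementSpec

def to_node_selector(spec: PlacementSpec) -> dict:
    if spec.label:
        return {'ceph-label/' + spec.label: ''}
    if spec.hosts:
        # simplified: real code would handle more than one host
        return {'kubernetes.io/hostname': spec.hosts[0].hostname}
    if spec.host_pattern == '*':
        return {}  # an empty selector matches every node
    raise ValueError('unsupported placement spec')

def to_placement_spec(selector: dict) -> PlacementSpec:
    if not selector:
        return PlacementSpec(host_pattern='*')
    for key, value in selector.items():
        if key.startswith('ceph-label/'):
            return PlacementSpec(label=key[len('ceph-label/'):])
        if key == 'kubernetes.io/hostname':
            return PlacementSpec(hosts=[value])
    raise ValueError('unsupported node selector')
```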
When retiring, m_blocks_to_log_entries doesn't remove the
corresponding write_entry (it should be `*it`, not `entry`)
that is about to be retired, which leads to read errors.
Discard entries should also be taken into account.
Joseph Sawaya [Tue, 31 Aug 2021 16:14:06 +0000 (12:14 -0400)]
mgr/rook: host add/rm label in rook orchestrator
This commit adds support for adding and removing host labels
using the rook orchestrator via `orch host label add/rm`.
The labels are added to the kubernetes node object as kubernetes
labels prefixed by "ceph-label/".
This commit also changes ceph orch host ls to display only the ceph
labels.
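For illustration, a hedged sketch of applying such a label with the
kubernetes Python client (node and label names here are made up; the
actual rook module code may differ):
```
from kubernetes import client, config

config.load_kube_config()  # or load_incluster_config() inside a pod
v1 = client.CoreV1Api()

# prefix the ceph label before storing it on the kubernetes node object
label = 'ceph-label/' + 'mylabel'
v1.patch_node('node-1', {'metadata': {'labels': {label: ''}}})

# when listing, show only the ceph labels
node = v1.read_node('node-1')
ceph_labels = [k[len('ceph-label/'):] for k in (node.metadata.labels or {})
               if k.startswith('ceph-label/')]
print(ceph_labels)
```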
Laura Flores [Wed, 8 Sep 2021 21:22:45 +0000 (16:22 -0500)]
doc/dev: specify location of e2e-tests script
I was just going through these steps myself and found it difficult to locate and run `run-frontend-e2e-tests.sh`. Specifying its location should make it clearer to others where the correct script lives.
When running the `lvm migrate` subcommand without any args, the
ceph-volume command fails with a stack trace:
Traceback (most recent call last):
  File "/usr/lib/python3.6/site-packages/ceph_volume/decorators.py", line 59, in newfunc
    return f(*a, **kw)
  File "/usr/lib/python3.6/site-packages/ceph_volume/main.py", line 151, in main
    terminal.dispatch(self.mapper, subcommand_args)
  File "/usr/lib/python3.6/site-packages/ceph_volume/terminal.py", line 194, in dispatch
    instance.main()
  File "/usr/lib/python3.6/site-packages/ceph_volume/devices/lvm/main.py", line 46, in main
    terminal.dispatch(self.mapper, self.argv)
  File "/usr/lib/python3.6/site-packages/ceph_volume/terminal.py", line 194, in dispatch
    instance.main()
  File "/usr/lib/python3.6/site-packages/ceph_volume/devices/lvm/migrate.py", line 520, in main
    self.migrate_osd()
  File "/usr/lib/python3.6/site-packages/ceph_volume/decorators.py", line 16, in is_root
    return func(*a, **kw)
  File "/usr/lib/python3.6/site-packages/ceph_volume/devices/lvm/migrate.py", line 403, in migrate_osd
    if self.args.osd_id:
AttributeError: 'Migrate' object has no attribute 'args'
That's because we exit the parse_argv function but then continue
executing the migrate_osd function. We should instead exit from the main
function.
This updates the argument parsing to use the same code as the new-db and
new-wal classes: parsing is now done in the make_parser function, while the
argv check is done in the main function, allowing the program to exit and
display the help message when no arguments are provided.
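A minimal sketch of that pattern, with simplified names rather than the
exact ceph-volume code:
```
import argparse

class Migrate(object):
    def __init__(self, argv):
        self.argv = argv

    def make_parser(self, prog, sub_command_help):
        # build the parser unconditionally; no parsing happens here
        parser = argparse.ArgumentParser(
            prog=prog,
            formatter_class=argparse.RawDescriptionHelpFormatter,
            description=sub_command_help)
        parser.add_argument('--osd-id', help='OSD id to migrate')
        return parser

    def main(self):
        sub_command_help = 'Migrate BlueFS data to another LV'
        parser = self.make_parser('ceph-volume lvm migrate', sub_command_help)
        if len(self.argv) == 0:
            # exit from main() so migrate_osd() is never reached
            print(sub_command_help)
            return
        self.args = parser.parse_args(self.argv)
        self.migrate_osd()

    def migrate_osd(self):
        if self.args.osd_id:   # safe: self.args is always set by now
            print('migrating osd.%s' % self.args.osd_id)

# Migrate([]).main() now prints the help text instead of raising
# AttributeError.
```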
Merge pull request #43010 from mgfritch/cephadm-log-thread-ident
cephadm: add thread ident to log messages
Reviewed-by: Adam King <adking@redhat.com>
Reviewed-by: Juan Miguel Olmo Martínez <jolmomar@redhat.com>
Reviewed-by: Sebastian Wagner <sewagner@redhat.com>
Concurrent rbd_pool_init() or rbd_create() operations on an unvalidated
(uninitialized) pool trigger a lockup in the ValidatePoolRequest state
machine, caused by blocking selfmanaged_snap_{create,remove}() calls.
There are two reactor threads by default (librados_thread_count), but we
effectively need N + 1 reactor threads for N concurrent pool validation
requests: each validation blocks a reactor thread in a synchronous snap
call, and at least one thread must remain free to service the completions.
With the default of two threads, even two concurrent validations deadlock,
so the problem bites especially for small N.
Switch to aio_selfmanaged_snap_{create,remove}(). At the time this
code was initially written, these aio variants weren't available. The
workqueue offload introduced later worked around the problem until the
move to asio in pacific.
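For illustration, a small Python analogy of this thread-starvation pattern
(not librbd code; ThreadPoolExecutor stands in for the asio reactor pool):
```
from concurrent.futures import ThreadPoolExecutor

N = 2  # concurrent "pool validations"

def validate_pool(pool):
    # Blocks its worker thread until the inner task completes,
    # mirroring the synchronous selfmanaged_snap_create() call.
    return pool.submit(lambda: 'snap created').result()

# With only N workers this deadlocks: every thread is blocked in
# validate_pool() and none is left to run the inner tasks.
# N + 1 workers is the minimum that guarantees progress.
pool = ThreadPoolExecutor(max_workers=N + 1)
futures = [pool.submit(validate_pool, pool) for _ in range(N)]
print([f.result() for f in futures])
```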
doc/rbd: describe Hyper-V disk addressing limitations
Hyper-V identifies passthrough VM disks by number instead of SCSI ID, and the
disk number can change across host reboots. This means that VMs can end up
using incorrect disks after the host reboots, which is an important security
concern. This issue also affects iSCSI and Fibre Channel disks.
We're going to document this Hyper-V limitation along with possible
workarounds.
When a transaction is interrupted and needs to be repeated, its reset_preserve_handle()
method is called to clear various extent sets and lists. The problem is that the clear
operation calls ExtentIndex::clear() instead of ExtentIndex::erase(), which leaves
CachedExtent::parent_index still pointing to the write_set/delayed_set while the
CachedExtent is no longer linked to those sets. This makes CachedExtent::~CachedExtent()
try to erase itself from CachedExtent::parent_index even though it's not linked, causing
failures in subsequent operations.
Xuehan Xu [Fri, 13 Aug 2021 13:57:27 +0000 (21:57 +0800)]
crimson/os/seastore: set journal_tail_target during replay
This is a bug fix: otherwise, if crimson-osd boots up multiple times without
filling more than one segment, segments may be used up and become
unreclaimable, since they would all have the same journal tail.
There will be two kinds of segments to scan: those created by the journal
and those created by the extent placement manager. We need a common module
to scan extents from both kinds of segments.
Jianpeng Ma [Wed, 8 Sep 2021 01:51:19 +0000 (09:51 +0800)]
librbd: Read requests need the exclusive lock when pwl-cache is enabled.
TestLibRBD.TestFUA describes the following workload:
a) write/read the same image w/ pwl-cache:
   write_image = open(image_name);
   read_image = open(image_name);
b) the i/o workload is:
   write(write_image)
     The write needs the exclusive lock, so it acquires it.
   read(read_image)
     In ExclusiveLock<I>::init(), the first read also needs the
     exclusive lock, so it acquires it. write_image releases the lock
     (flushing its data to the OSDs and removing its cache). read_image
     initializes its pwl-cache; the read first checks the pwl-cache,
     misses, and then reads from the OSDs.
   write(write_image)
     The write needs the exclusive lock and acquires it. This makes
     read_image remove its empty cache. write_image initializes its
     cache pool and writes the data to the cache.
   read(read_image)
     send_set_require_lock() only marks writes as requiring the lock,
     so the read doesn't acquire the exclusive lock and reads stale
     data from the OSDs.
Because the second read doesn't need the exclusive lock, write_image never
releases it (which would flush the dirty data to the OSDs and shut down its
pwl-cache), so the second read doesn't return the latest data.
So reads should also require the exclusive lock when pwl-cache is enabled.
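For illustration, a hedged reconstruction of this workload with the rbd
Python bindings (pool/image names are assumptions, and pwl-cache must be
enabled in the client configuration for the race to apply):
```
import rados
import rbd

cluster = rados.Rados(conffile='/etc/ceph/ceph.conf')
cluster.connect()
ioctx = cluster.open_ioctx('rbd')  # assumed pool name

# two handles to the same image, as in the TestFUA workload
write_image = rbd.Image(ioctx, 'test-image')  # assumed image name
read_image = rbd.Image(ioctx, 'test-image')

write_image.write(b'a' * 512, 0)  # write acquires the exclusive lock
read_image.read(0, 512)           # first read also takes the lock
write_image.write(b'b' * 512, 0)  # lock moves back to write_image
data = read_image.read(0, 512)    # without this fix: may return stale data

for img in (write_image, read_image):
    img.close()
ioctx.close()
cluster.shutdown()
```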
Fixes: https://tracker.ceph.com/issues/51438
Tested-by: Feng Hualong <hualong.feng@intel.com>
Signed-off-by: Jianpeng Ma <jianpeng.ma@intel.com>
This is a regression introduced by 9212420: when the host uses a logical
partition, lsblk reports that partition as a child of the physical device,
prefixed with the `└─` characters.
This leads the `raw list` subcommand to show the lsblk error on stderr:
```
$ ceph-volume raw list
{}
stderr: lsblk: `-/dev/sda1: not a block device
```
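For illustration, a minimal sketch of the kind of cleanup that avoids feeding
the prefixed name back to lsblk (the helper name is hypothetical; the actual
fix may differ):
```
def strip_tree_prefix(line):
    # lsblk prefixes child devices (e.g. logical partitions) with
    # tree-drawing characters such as '└─'; keep only the '/dev/...'
    # path so it can be passed back to lsblk as a real block device.
    idx = line.find('/dev/')
    return line[idx:] if idx != -1 else line.strip()

assert strip_tree_prefix('└─/dev/sda1') == '/dev/sda1'
```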