ceph.spec, debian: move deps from base to ceph-volume
this change makes util-linux, xfsprogs and e2fsprogs runtime deps of ceph-volume
ceph-volume uses blkid and lsblk, which are in turn packaged by
util-linux.
util-linux were added as build dependency to fulfill the needs of
ceph-disk. and we tested ceph-disk as part of "make check", since
ceph-disk was dropped, there is no need to have util-linux as
build dependency anymore.
but it should be a runtime dependency of ceph-volume. and it is not a
build dependency of ceph, so it's removed from the build dependency list
as well.
debian: split ceph-volume into a separated package
ceph-volume is a tool implemented in pure python, so it would be better
to make it a architecture independent package for better
maintainability.
in this change
* ceph-volume is extracted out into a separated package
* ceph-volume depends on ceph-osd, as it deploys it and relies on
an already-installed ceph-osd in the system.
* ceph-osd recommends ceph-volume. as ceph-osd can be used as a
standalone package. but ceph-volume enhances it. also, to ensure
the existing users to get ceph-volume installed along with
ceph-osd.
Samuel Just [Sat, 11 Sep 2021 04:11:24 +0000 (21:11 -0700)]
crimson/os/seastore/transaction_manager: fix delayed alloc addr in rewrite_logical_extent
Setting nlextent's to the addr of the original extent violates invariants
in the intrusive_set we're using for ExtentIndex and generally breaks the
assumption in cache that paddrs are unique until reuse. This means we'll
do an extra update_mapping for now -- we can refactor SegmentCleaner to
avoid it later.
Samuel Just [Fri, 10 Sep 2021 01:11:05 +0000 (18:11 -0700)]
crimson/os/seastore: replace fresh_block_list with inline/ool_block_list
With this change, Transaction::write_set is the union of mutually exclusive
lists
- delayed_alloc_list, inline_block_list, and mutated_block_list prior to
calling ExtentPlacementManager::delayed_alloc_or_ool_write
and
- inline_block_list, ool_block_list, and mutated_block_list after
calling ExtentPlacementManager::delayed_alloc_or_ool_write
It could be that `_carriage` points on empty bptr which is
also stored in the `_buffers` container -- this happens
when we'are trying to preserve the appendable space to not
waste memory and avoid new allocation.
Before the change this approach was imposing unncessary
rebuilding on `c_str()`. Now we take into consideration
the special case when a bufferlist has the extra, empty
bptr at its end.
The current check only allows to request an OSD id that exists but
marked as 'destroyed'.
With this small fix, we can now use `--osd-id` with an id that doesn't
exist at all.
When retiring, m_blocks_to_log_entries doesn't remove
corresponding write_entry (should be `*it` not `entry`)
that will be retired. It leads to read error. And
there should also consider discard entries.
Joseph Sawaya [Tue, 31 Aug 2021 16:14:06 +0000 (12:14 -0400)]
mgr/rook: host add/rm label in rook orchestrator
This commit adds the functionality for adding/removing labels
to hosts using the rook orchestrator and orch host label add/rm.
The labels are added to the kubernetes node object as kubernetes
labels prefixed by "ceph-label/".
This commit also changes ceph orch host ls to display only the ceph
labels.
Laura Flores [Wed, 8 Sep 2021 21:22:45 +0000 (16:22 -0500)]
doc/dev: specify location of e2e-tests script
I was just going through these steps myself, and I found it difficult to locate and run `run-frontend-e2e-tests.sh`. Specifying the location of it might make it clearer to others where the correct script is located.
When running the `lvm migrate` subcommand without any args then the
ceph-volume command fails with a stack trace.
Traceback (most recent call last):
File "/usr/lib/python3.6/site-packages/ceph_volume/decorators.py", line 59, in newfunc
return f(*a, **kw)
File "/usr/lib/python3.6/site-packages/ceph_volume/main.py", line 151, in main
terminal.dispatch(self.mapper, subcommand_args)
File "/usr/lib/python3.6/site-packages/ceph_volume/terminal.py", line 194, in dispatch
instance.main()
File "/usr/lib/python3.6/site-packages/ceph_volume/devices/lvm/main.py", line 46, in main
terminal.dispatch(self.mapper, self.argv)
File "/usr/lib/python3.6/site-packages/ceph_volume/terminal.py", line 194, in dispatch
instance.main()
File "/usr/lib/python3.6/site-packages/ceph_volume/devices/lvm/migrate.py", line 520, in main
self.migrate_osd()
File "/usr/lib/python3.6/site-packages/ceph_volume/decorators.py", line 16, in is_root
return func(*a, **kw)
File "/usr/lib/python3.6/site-packages/ceph_volume/devices/lvm/migrate.py", line 403, in migrate_osd
if self.args.osd_id:
AttributeError: 'Migrate' object has no attribute 'args'
That's because we're exiting the parse_argv function but we continue to
execute the migrate_osd function. We should instead exit from the main function.
This update the parsing argument to have the same code than new-db and
new-wal classes.
Now the parsing is done in the make_parser function but the argv testing is
done in the main function allowing to exit the program and displaying the
help message when no arguments are provided.
Merge pull request #43010 from mgfritch/cephadm-log-thread-ident
cephadm: add thread ident to log messages
Reviewed-by: Adam King <adking@redhat.com> Reviewed-by: Juan Miguel Olmo MartÃnez <jolmomar@redhat.com> Reviewed-by: Sebastian Wagner <sewagner@redhat.com>
Concurrent rbd_pool_init() or rbd_create() operations on an unvalidated
(uninitialized) pool trigger a lockup in ValidatePoolRequest state
machine caused by blocking selfmanaged_snap_{create,remove}() calls.
There are two reactor threads by default (librados_thread_count) but we
effectively need N + 1 reactor threads for N concurrent pool validation
requests, especially for small N.
Switch to aio_selfmanaged_snap_{create,remove}(). At the time this
code was initially written, these aio variants weren't available. The
workqueue offload introduced later worked prior to the move to asio in
pacific.
doc/rbd: describe Hyper-V disk addressing limitations
Hyper-V identifies passthrough VM disks by number instead of SCSI ID, although
the disk number can change across host reboots. This means that the VMs can end
up using incorrect disks after rebooting the host, which is an important
security concern. This issue also affects iSCSI and Fibre Channel disks.
We're going to document this Hyper-V limitation along with possible
workarounds.
When a transaction is interrupted and needs to repeat, its reset_preserve_handle() method
is called to clear various extent set and list. The problem is the clear operation call
ExtentIndex::clear() instead of ExtentIndex::erase(), which would leave CachedExtent::parent_index
still pointing to the write_set/delayed_set while the CachedExtent is no longer linked to
those sets. This would make CachedExtent::~CachedExtent() to try to erase itself from
CachedExtent::parent_index even it's not linked, which would cause failures in the successive
operations
Xuehan Xu [Fri, 13 Aug 2021 13:57:27 +0000 (21:57 +0800)]
crimson/os/seastore: set journal_tail_target during replay
This is a bug fix, otherwise if crimson-osd boot up multiple times without
filling up more than one segment, segments may be used up and can't be
reclaimed as they would have the same journal tail
As there will be two kinds of segments to be scanned, those created by the journal
and those created by the extent placement manager. We need a common module to scan
extents of both of these two kinds of segments