Luis Domingues [Tue, 19 Jul 2022 09:04:34 +0000 (11:04 +0200)]
mgr/cephadm: add parsing for config on osd specs
Cephadm, while parsing spec files, can parse ceph configuration
for almost all the services, except for OSDs, where it fails
with a nasty "unexpected keyword argument config".
This commit fixes this issue.
Signed-off-by: Luis Domingues <domingues.luis@protonmail.ch>
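The failure mode described above can be reproduced in a minimal sketch. The class and function names below (`ServiceSpecSketch`, `OsdSpecBefore`, `OsdSpecAfter`, `from_dict`) are hypothetical stand-ins, not the actual cephadm classes; the sketch only assumes that a spec is built by passing the parsed dictionary to the constructor as keyword arguments.

```python
from typing import Any, Dict, Optional


class ServiceSpecSketch:
    """Hypothetical generic service spec that accepts a 'config' section."""

    def __init__(self, service_type: str,
                 config: Optional[Dict[str, str]] = None) -> None:
        self.service_type = service_type
        self.config = config or {}


class OsdSpecBefore(ServiceSpecSketch):
    """Hypothetical OSD spec whose constructor lacks the 'config' keyword."""

    def __init__(self, data_devices: Optional[Dict[str, Any]] = None) -> None:
        super().__init__("osd")
        self.data_devices = data_devices


class OsdSpecAfter(ServiceSpecSketch):
    """Same spec, now accepting 'config' and forwarding it to the base class."""

    def __init__(self, data_devices: Optional[Dict[str, Any]] = None,
                 config: Optional[Dict[str, str]] = None) -> None:
        super().__init__("osd", config=config)
        self.data_devices = data_devices


def from_dict(cls, spec: Dict[str, Any]):
    """Build a spec by passing every key except service_type as a keyword."""
    args = {k: v for k, v in spec.items() if k != "service_type"}
    return cls(**args)
```

With a spec dictionary that contains a `config` key, `from_dict(OsdSpecBefore, spec)` raises a `TypeError` mentioning the unexpected keyword argument, while `from_dict(OsdSpecAfter, spec)` keeps the configuration.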
Ilya Dryomov [Fri, 20 May 2022 12:05:03 +0000 (14:05 +0200)]
qa/suites/rbd: disable workunit timeout for dynamic_features_no_cache
The I/O workload in this test is xfstests (qa/run_xfstests_qemu.sh)
which isn't subject to any timeout other than the global max_job_time
limit in any other subsuite (e.g. qemu/workloads/qemu_xfstests.yaml).
But here, there is a parallel "op" workload defined as a workunit.
The workunit task has a default timeout of 3 hours which is effectively
imposed on the entire job. In the "rbd cache = false" configuration,
it's sometimes exceeded.
rbd: don't default empty pool name unless namespace is specified
Commit 96f05a7956b3 ("rbd: delay determination of default pool name")
broke "rbd perf image iostat" and "rbd perf image iotop" GLOBAL_POOL_KEY
support (the ability to blend all rbd pools together into a single
view).
It doesn't really thrash anything, just repeatedly restarts the
workload on top of a dirty cache file. rbd_pwl_cache_recovery is
more on point and gets covered by existing CODEOWNERS.
* make paxos_size() unsigned: it returns the size of
MonMap::mon_info, so it is always non-negative and, more
importantly, it represents a size.
* change the type of MonMap::removed_ranks from std::set<int>
to std::set<unsigned>, for two reasons:
- removed_ranks only tracks ranks greater than or equal to 0
- it helps to silence the warnings listed below.
MonMap::removed_ranks is persisted using encode()/decode(), but this
change is backward compatible: we use the raw encoder for both
signed and unsigned integers, so the two encodings differ only
when the MSB of the number is set. That is unlikely to happen, as
we have neither a negative rank in removed_ranks nor a rank
greater than `(unsigned)-1`, i.e., 0xffffffff.
/home/kefu/dev/ceph/src/mon/ElectionLogic.cc: In member function ‘void ElectionLogic::end_election_period()’:
/home/kefu/dev/ceph/src/mon/ElectionLogic.cc:173:23: error: comparison of integer expressions of different signedness: ‘std::set<int>::size_type’ {aka ‘long unsigned int’} and ‘int’ [-Werror=sign-compare]
173 | acked_me.size() > (elector->paxos_size() / 2)) {
| ~~~~~~~~~~~~~~~~^~~~~~~~~~~~~~~~~~~~~~~~~~~~~
/home/kefu/dev/ceph/src/mon/ElectionLogic.cc: In member function ‘void ElectionLogic::propose_connectivity_handler(int, epoch_t, const ConnectionTracker*)’:
/home/kefu/dev/ceph/src/mon/ElectionLogic.cc:338:28: error: comparison of integer expressions of different signedness: ‘unsigned int’ and ‘int’ [-Werror=sign-compare]
338 | for (unsigned i = 0; i < elector->paxos_size(); ++i) {
| ~~^~~~~~~~~~~~~~~~~~~~~~~
/home/kefu/dev/ceph/src/mon/ElectionLogic.cc: In member function ‘void ElectionLogic::receive_ack(int, epoch_t)’:
/home/kefu/dev/ceph/src/mon/ElectionLogic.cc:469:25: error: comparison of integer expressions of different signedness: ‘std::set<int>::size_type’ {aka ‘long unsigned int’} and ‘int’ [-Werror=sign-compare]
469 | if (acked_me.size() == elector->paxos_size()) {
| ~~~~~~~~~~~~~~~~^~~~~~~~~~~~~~~~~~~~~~~~
cc1plus: all warnings being treated as errors
make[3]: *** [src/mon/CMakeFiles/mon.dir/build.make:328: src/mon/CMakeFiles/mon.dir/ElectionLogic.cc.o] Error 1
make[3]: *** Waiting for unfinished jobs....
make[3]: Leaving directory '/home/kefu/dev/ceph/build'
[ 48%] Built target libglobal_objs
/home/kefu/dev/ceph/src/mon/Elector.cc: In member function ‘void Elector::notify_rank_removed(int)’:
/home/kefu/dev/ceph/src/mon/Elector.cc:734:43: error: comparison of integer expressions of different signedness: ‘unsigned int’ and ‘int’ [-Werror=sign-compare]
734 | for (unsigned i = rank_removed + 1; i <= paxos_size() ; ++i) {
| ~~^~~~~~~~~~~~~~~
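The backward-compatibility argument for removed_ranks can be checked with a small sketch. It assumes a fixed-width 32-bit little-endian layout and uses Python's `struct` module as a stand-in for a raw integer encoder; the helper names are hypothetical and this is not Ceph's encode()/decode() code.

```python
import struct


def encode_signed(rank: int) -> bytes:
    """Raw 32-bit little-endian encoding of a signed integer (assumed layout)."""
    return struct.pack("<i", rank)


def encode_unsigned(rank: int) -> bytes:
    """Raw 32-bit little-endian encoding of an unsigned integer (assumed layout)."""
    return struct.pack("<I", rank)


def decode_unsigned(raw: bytes) -> int:
    """Read four raw bytes back as an unsigned integer."""
    return struct.unpack("<I", raw)[0]


# Non-negative values below 2**31 have identical bytes in both encodings,
# so data written as signed can be read back as unsigned unchanged.
for rank in (0, 1, 7, 2**31 - 1):
    assert encode_signed(rank) == encode_unsigned(rank)
    assert decode_unsigned(encode_signed(rank)) == rank
```

The two interpretations diverge only once the most significant bit is set: the bytes written for a signed -1 read back as 0xffffffff when decoded as unsigned.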
Currently, the following transaction execution sequence leads to
the loss of a backref:
1. Trans `A` merges an alloc backref for extent `X`
2. Trans `B` adds a release backref for extent `X` to the backref cache,
during which it finds an in-cache alloc backref for extent `X` and
decides not to add the release backref to the cache
3. Trans `A` commits
In the above sequence, the release backref for extent `X` is lost.
This is a regression introduced when we tried to optimize the backref cache.
This commit fixes the issue by caching inflight backrefs in a multiset:
alloc/release ops that happen on the same paddr are queued in the order
in which they happen. When doing gc, all those backrefs are merged.
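The idea of queuing alloc/release ops per paddr and merging them at gc time can be sketched as follows. This is an illustration only: the class `InflightBackrefs`, its methods, and the use of a dict of lists to emulate an ordered multiset are assumptions, not the seastore implementation.

```python
from collections import defaultdict
from typing import DefaultDict, Dict, List

ALLOC = "alloc"
RELEASE = "release"


class InflightBackrefs:
    """Hypothetical cache that keeps every inflight op, in arrival order."""

    def __init__(self) -> None:
        # paddr -> ops in the order they happened (emulates an ordered multiset)
        self._ops: DefaultDict[int, List[str]] = defaultdict(list)

    def add(self, paddr: int, op: str) -> None:
        """Queue an op; nothing is dropped, even if an alloc is already cached."""
        self._ops[paddr].append(op)

    def merge_for_gc(self) -> Dict[int, bool]:
        """Replay the queued ops per paddr; True means the extent is still live."""
        live: Dict[int, bool] = {}
        for paddr, ops in self._ops.items():
            state = False
            for op in ops:
                state = (op == ALLOC)
            live[paddr] = state
        self._ops.clear()
        return live
```

Because the release from Trans `B` is queued behind the alloc from Trans `A` rather than discarded, the merge sees both and ends with the extent marked as released.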
Samuel Just [Tue, 12 Jul 2022 22:35:44 +0000 (22:35 +0000)]
crimson/osd: introduce pg_shard_manager to clarify shard-local vs osd-wide state
This commit begins to change ShardServices to be the interface by which
PGs access shard-local and osd-wide state. Future work will further
clarify this interface boundary and introduce machinery to mediate cold
path access to state on remote shards.
Tatjana Dehler [Thu, 7 Jul 2022 15:21:14 +0000 (17:21 +0200)]
mgr/dashboard: prevent alert redirect
Prevent Alertmanager alerts from being redirected to the active mgr
dashboard instance. There are two reasons for it:
1. It doesn't bring any additional benefit. The Alertmanager config
includes all available mgr instances - active and passive ones. In
case of an alert, it will be sent to all of them. It ensures that
the active mgr dashboard will receive the alert in any case.
2. The redirect URL includes the mgr IP and NOT the FQDN. This leads
to issues in environments where an SSL certificate is configured and
matches the FQDNs, only.
Fixes: https://tracker.ceph.com/issues/56401
Signed-off-by: Tatjana Dehler <tdehler@suse.com>
Yin Congmin [Fri, 7 Jan 2022 07:03:44 +0000 (15:03 +0800)]
qa/tasks: add thrash test for persistent write log cache
Add a thrash test for the persistent write log cache: run rbd bench
on the persistent write log cache, thrash rbd bench, and test the
recovery function of the persistent write log cache.
crimson/os/seastore/cache: fine-grained lru cache control with GC
GC transactions are not driven by user activity, so the extent reads
issued by GC transactions don't satisfy the temporal locality
principle. These extents should not be added to the LRU cache.
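A minimal sketch of this policy is shown below. The class `ExtentCacheSketch`, the `from_gc` flag, and the `load` callback are hypothetical names chosen for illustration; the sketch also assumes that a GC read that hits the cache should not promote the entry, which the commit message does not state.

```python
from collections import OrderedDict
from typing import Any, Callable


class ExtentCacheSketch:
    """Hypothetical LRU cache that ignores reads issued by GC transactions."""

    def __init__(self, capacity: int) -> None:
        self.capacity = capacity
        self.lru: "OrderedDict[int, Any]" = OrderedDict()

    def read(self, paddr: int, load: Callable[[int], Any],
             from_gc: bool = False) -> Any:
        if paddr in self.lru:
            extent = self.lru[paddr]
            if not from_gc:
                # Only user-driven reads refresh recency (assumption).
                self.lru.move_to_end(paddr)
            return extent
        extent = load(paddr)
        if not from_gc:
            # User reads populate the cache and may evict the oldest entry.
            self.lru[paddr] = extent
            if len(self.lru) > self.capacity:
                self.lru.popitem(last=False)
        return extent
```

With this split, a burst of GC reads cannot push out extents that users accessed recently.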
Zack Cerza [Tue, 21 Jun 2022 17:28:30 +0000 (11:28 -0600)]
ceph-volume: Rename env var; add warning
So that we can have a nice big warning that fires once per invocation, I
think using a callable class with a class attribute seems like a decent
approach. A closure could work too.
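The callable-class approach mentioned above might look like the following sketch. The class name `WarnOnce` and its attribute `fired` are hypothetical and are not taken from ceph-volume; the sketch only shows how a class attribute lets the warning fire once per process regardless of how many instances are created.

```python
import warnings


class WarnOnce:
    """Callable that emits its warning only the first time any instance is called."""

    fired = False  # class attribute, shared by every instance

    def __init__(self, message: str) -> None:
        self.message = message

    def __call__(self) -> bool:
        """Emit the warning if it has not fired yet; return True if it was emitted."""
        if WarnOnce.fired:
            return False
        WarnOnce.fired = True
        warnings.warn(self.message, stacklevel=2)
        return True
```

Keeping the flag on the class rather than on the instance means call sites do not need to share a single object to get once-per-invocation behavior.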
Zack Cerza [Tue, 17 May 2022 17:29:02 +0000 (11:29 -0600)]
ceph-volume: Optionally consume loop devices
A similar proposal was rejected in #24765; I understand the logic
behind the rejection, but this will allow us to run Ceph clusters on
machines that lack disk resources for testing purposes. We just need to
make it impossible to accidentally enable, and make it clear it is
unsupported.
Image contexts are reopened even though we pass the context as an
argument. This commit changes that, so you no longer need to reopen
an rbd image context.
Signed-off-by: Pere Diaz Bou <pdiazbou@redhat.com>
mgr/dashboard: add rbd list search and disable sorting
- Disable sorting in each column because it will not be possible to
sort with this pagination implementation.
- Add search capabilities to the rbd list pagination endpoint.
Signed-off-by: Pere Diaz Bou <pdiazbou@redhat.com>