mgr/dashboard: introduce memory and cpu usage for daemons
Introducing two new columns, memory and CPU usage, in the Cluster->Host->Daemons table.
Fixes: https://tracker.ceph.com/issues/55218
Signed-off-by: Avan Thakkar <athakkar@redhat.com>
Co-authored-by: Aashish Sharma <aasharma@redhat.com>
Conflicts:
src/pybind/mgr/cephadm/module.py
- _process_ls_output() doesn't exist in Pacific because the agent hasn't been backported yet, so
the equivalent changes need to be made in serve.py instead (see the sketch below).
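For reference, a minimal sketch of the serve.py side of the change, assuming the entries returned by `cephadm ls` carry per-daemon memory/CPU fields; the key and attribute names are illustrative and not necessarily the exact Pacific code:

```python
def update_daemon_usage(daemon, ls_entry: dict) -> None:
    """Copy per-daemon resource usage from a `cephadm ls` entry onto the
    daemon description object the dashboard later reads.
    Key and attribute names here are illustrative."""
    mem = ls_entry.get('memory_usage')    # resident memory in bytes, if reported
    cpu = ls_entry.get('cpu_percentage')  # e.g. '2.5%', if reported
    if mem is not None:
        daemon.memory_usage = int(mem)
    if cpu is not None:
        daemon.cpu_percentage = cpu
```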
In `master` the milestone step exits and prevents the remaining steps from running. I previously tried the `continue-on-error` flag, but it didn't work, so let's try putting that step at the end.
Laura Flores [Mon, 16 May 2022 22:59:42 +0000 (17:59 -0500)]
qa/suites/rados/thrash-erasure-code-big/thrashers: add `osd max backfills` setting to mapgap and pggrow
All `rados/thrash-erasure-code-big` tests that die due to the “wait_for_recovery” timeout have one thing in common: They contain either `thrashers/pggrow` or `thrashers/mapgap`.
The difference between pggrow and mapgap and the non-offending thrashers (default, careful, fastread, and morepggrow) is that pggrow and mapgap lack an override setting for `osd max backfills`, the maximum number of backfill operations allowed to or from a single OSD. The higher the number, the quicker the recovery. By default this value is 1; the non-offending thrashers override it in their .yaml files with a value greater than 1, while pggrow and mapgap do not.
The mclock op scheduler is known to override `osd max backfills` with a high value, but all of the thrash-erasure-code-big thrashers have their op queue set to “debug_random”, which chooses randomly between op queues (the debug_random op queue is set to override the default mclock_scheduler in qa/config/rados.yaml). So, coupled with the “debug_random” op queue, the low `osd max backfills` setting causes some tests to time out during recovery.
WITHOUT `osd max backfills`, as they are now, “mapgap” and “pggrow” tests die due to timed-out recovery about 17/100 times, as seen here with a pggrow test: http://pulpito.front.sepia.ceph.com/lflores-2022-05-18_14:24:29-rados:thrash-erasure-code-big-master-distro-default-smithi/
WITH `osd max backfills` specified, as I have suggested in this PR, 99/100 tests passed, with one test failing for a different reason:
http://pulpito.front.sepia.ceph.com/lflores-2022-05-17_22:40:27-rados:thrash-erasure-code-big-master-distro-default-smithi/
I also scheduled 145 tests WITH `osd max backfills` that are a mix of pggrow and mapgap thrashers. 144/145 tests passed, with one test failing for a different reason. http://pulpito.front.sepia.ceph.com/lflores-2022-05-17_15:27:54-rados:thrash-erasure-code-big-master-distro-default-smithi/
Fixes: https://tracker.ceph.com/issues/51076
Signed-off-by: Laura Flores <lflores@redhat.com>
(cherry picked from commit 40062676c2ceed49b9fa147127ffa83ba6118e2a)
Adam King [Fri, 1 Apr 2022 12:20:28 +0000 (08:20 -0400)]
mgr/cephadm: make UpgradeState from_json a bit safer
This way, downgrades to whatever version this change lands in
(and any later one) shouldn't break just because new parameters
were added to UpgradeState. We can't do much about downgrades
to versions older than this one, but this should help in the
future.
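A minimal sketch of the idea, with placeholder fields (not the actual cephadm class): drop any keys this version doesn't know about before constructing the object, so state persisted by a newer release doesn't break from_json here.

```python
from dataclasses import dataclass, fields
from typing import Any, Dict, Optional

@dataclass
class UpgradeState:
    # illustrative fields only; the real UpgradeState has more
    target_name: str = ''
    progress_id: str = ''
    paused: bool = False

    @classmethod
    def from_json(cls, data: Optional[Dict[str, Any]]) -> Optional['UpgradeState']:
        if not data:
            return None
        # keep only the keys this version knows about, so extra parameters
        # written by a newer mgr are ignored instead of raising TypeError
        known = {f.name for f in fields(cls)}
        return cls(**{k: v for k, v in data.items() if k in known})
```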
Adam King [Mon, 28 Mar 2022 16:10:15 +0000 (12:10 -0400)]
mgr/cephadm: split _do_upgrade into sub functions
This function was around 500 lines and difficult to work
with. Splitting it into sub functions should hopefully make
it a bit easier to understand and make changes to.
Volker Theile [Tue, 10 May 2022 13:25:54 +0000 (15:25 +0200)]
cephadm: prometheus: The generatorURL in alerts is only using hostname
Prometheus is currently using only the hostname in the 'generatorURL' of an alert, which causes issues when clicking the URL in the Ceph Dashboard or elsewhere, because in most cases the bare hostname of the node running the Prometheus container is not resolvable.
To fix that, the command line argument '--web.external-url' must be appended in the systemd unit file of the Prometheus container, e.g. '--web.external-url http://foo.bar:9095', where an FQDN hostname is used.
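A minimal sketch of how such an argument could be assembled, assuming the host's FQDN is what should end up in the URL (the helper name and port default are illustrative, not the actual cephadm code):

```python
import socket

def prometheus_external_url_args(port: int = 9095) -> list:
    """Illustrative helper: build the extra Prometheus command line
    arguments so the generatorURL in alerts carries a resolvable FQDN
    instead of the bare container hostname."""
    fqdn = socket.getfqdn()  # e.g. 'foo.bar' instead of just 'foo'
    return ['--web.external-url', f'http://{fqdn}:{port}']
```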
Volker Theile [Mon, 9 May 2022 13:31:15 +0000 (15:31 +0200)]
mgr/dashboard: Creating and editing Prometheus AlertManager silences is buggy
When creating a new monitoring silence the form is pre-filled with the wrong alert data. The alert data from the very first object in the API response is always used instead of the alert identified by the 'fingerprint' property.
The same problem applies to editing silences: the selected silence is not edited; instead the first one in the returned API response is always used rather than the one with the specified 'id' property.
The main problem with the original implementation is that the Prometheus Alertmanager API endpoints /api/v1/[alerts/silences] do not support querying. To fix that, filtering is done in the frontend.
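Since the endpoints cannot be queried for a single object, the client has to select the matching entry itself. A rough sketch of that selection, written in Python for brevity (the actual fix lives in the TypeScript frontend; function names are illustrative):

```python
from typing import Any, Dict, List, Optional

def find_alert(alerts: List[Dict[str, Any]], fingerprint: str) -> Optional[Dict[str, Any]]:
    """Pick the alert matching the given fingerprint instead of blindly
    using the first element of the API response."""
    return next((a for a in alerts if a.get('fingerprint') == fingerprint), None)

def find_silence(silences: List[Dict[str, Any]], silence_id: str) -> Optional[Dict[str, Any]]:
    """Same idea for silences, matched on their 'id' property."""
    return next((s for s in silences if s.get('id') == silence_id), None)
```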
Zac Dover [Wed, 18 May 2022 10:36:53 +0000 (20:36 +1000)]
doc/start: s/3/three/ in intro.rst
I'm changing "3" to "three" for two reasons:
1. It's correct.
2. This allows me to test backports into Octopus, Pacific, and Quincy.
I am particularly interested in seeing what happens when I attempt
the backport into Octopus, because backports into Octopus have
failed before. This will provide me with another data point.
Adam King [Thu, 18 Nov 2021 20:22:39 +0000 (15:22 -0500)]
mgr/cephadm: re-use old ip when re-adding hosts if necessary
When a host is re-added without an explicit ip, we can default to the old
ip we had stored for the host rather than keeping the loopback address or
throwing an exception. We only want to actually raise an error when the
only remaining options are to error out or to use a resolved loopback address.
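A sketch of the intended behaviour, with illustrative names (not the actual cephadm logic): prefer an explicit ip, then a non-loopback resolved address, then the previously stored ip, and only raise when nothing usable is left.

```python
import socket
from typing import Optional

def resolve_host_addr(hostname: str, explicit_addr: Optional[str],
                      old_addr: Optional[str]) -> str:
    if explicit_addr:
        return explicit_addr
    resolved = socket.gethostbyname(hostname)
    if not resolved.startswith('127.'):
        return resolved
    if old_addr and not old_addr.startswith('127.'):
        # host re-added without an ip: fall back to what we had stored
        return old_addr
    raise RuntimeError(
        f'Cannot resolve a usable ip for {hostname}; got loopback {resolved}')
```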
Redouane Kachach [Tue, 17 May 2022 15:26:39 +0000 (17:26 +0200)]
mgr/cephadm: stripping out / from the end of the url
Fixes: https://tracker.ceph.com/issues/55638
Signed-off-by: Redouane Kachach <rkachach@redhat.com>
(cherry picked from commit 17032f6be22e9efc3e199d7e35091025bfaae965)
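The normalization itself is small; a one-function sketch of the idea (the function name is illustrative):

```python
def normalize_url(url: str) -> str:
    # drop any trailing '/' so URLs built from this base don't end up with a
    # double slash; note rstrip removes all trailing slashes, not just one
    return url.rstrip('/')
```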
mgr/cephadm: do not add _admin label when no-minimize-config is provided
Fixes: https://tracker.ceph.com/issues/52727
Signed-off-by: Redouane Kachach <rkachach@redhat.com>
(cherry picked from commit 01c8999d0354a71a7ef8526aab9b39e30d67c1bb)
Moritz Röhrich [Mon, 21 Mar 2022 16:32:25 +0000 (17:32 +0100)]
cephadm: avoid crashing on expected non-zero exit
- Avoid crashing when a call out to an external program returns a non-zero
exit status that is expected.
Some programs communicate information other than error/no error through
their exit status; `systemctl status`, for example, returns different exit
codes depending on the actual state of the units in question.
In cases where this is expected, crashing with a RuntimeError exception
is inappropriate and should be avoided (see the sketch below).
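A hypothetical sketch of the pattern (parameter names are illustrative, not the real cephadm API): the caller declares which exit codes are expected, and only unexpected ones raise.

```python
import subprocess
from typing import Iterable, List

def call(cmd: List[str], ok_return_codes: Iterable[int] = (0,)) -> subprocess.CompletedProcess:
    res = subprocess.run(cmd, capture_output=True, text=True)
    if res.returncode not in ok_return_codes:
        raise RuntimeError(
            f'{" ".join(cmd)} failed unexpectedly (rc={res.returncode}): {res.stderr}')
    return res

# `systemctl status` encodes unit state in its exit code (e.g. 3 for an
# inactive unit), so such codes should be accepted rather than treated as fatal:
# call(['systemctl', 'status', 'ceph-osd@0.service'], ok_return_codes=(0, 3, 4))
```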
Fixes: https://tracker.ceph.com/issues/55117
Signed-off-by: Moritz Röhrich <moritz.rohrich@suse.com>
(cherry picked from commit a02be6f22fa18094cd8758700ab74581b6ce1701)
Cory Snyder [Tue, 17 May 2022 09:24:53 +0000 (05:24 -0400)]
mgr/ActivePyModules.cc: fix cases where GIL is held while attempting to lock mutex
The mgr process can deadlock if the GIL is held while attempting to lock a mutex.
Relevant regressions were introduced in commit a356bac. This fixes those regressions
and also cleans up some unnecessary yielding of the GIL.
ceph-volume/tests: reject loop devices in lvm.conf
The current task doesn't work (typo?).
Otherwise api/lvm.py can't work properly: functions such as
`get_single_lv()` and many others don't return the expected results.
Indeed, lvm is confused by the nvme_loop setup.
This adds support for complex OSD creation with the command
`orch daemon add osd`.
Any argument supported by `DriveGroupSpec()` can be passed on the command line.
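A purely illustrative sketch of the idea (not the exact CLI syntax or mgr code): free-form key=value tokens from the command line are turned into keyword arguments that could be handed to DriveGroupSpec(**kwargs).

```python
from typing import Any, Dict, List

def parse_spec_args(tokens: List[str]) -> Dict[str, Any]:
    kwargs: Dict[str, Any] = {}
    for token in tokens:
        key, _, value = token.partition('=')
        if value.lower() in ('true', 'false'):   # booleans arrive as strings
            kwargs[key] = value.lower() == 'true'
        elif ',' in value:                       # simple comma-separated lists
            kwargs[key] = value.split(',')
        else:
            kwargs[key] = value
    return kwargs

# e.g. ['data_devices=/dev/sdb,/dev/sdc', 'encrypted=true'] ->
# {'data_devices': ['/dev/sdb', '/dev/sdc'], 'encrypted': True}
```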
Cephadm shouldn't try to deploy a disk reported as unavailable by ceph-volume.
The idea here is to check the rejection reason so we can still use DB devices
in case of OSD replacement.
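A sketch of that check, with illustrative reason strings (not an authoritative list): a device flagged unavailable by ceph-volume is only skipped if it was rejected for a reason we cannot work around.

```python
from typing import Iterable

# rejection reasons that still allow reuse as a DB device during OSD
# replacement; the exact strings here are examples only
REUSABLE_REASONS = {'LVM detected', 'locked'}

def usable_for_replacement(rejected_reasons: Iterable[str]) -> bool:
    return all(reason in REUSABLE_REASONS for reason in rejected_reasons)
```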
Sage Weil [Thu, 12 Aug 2021 15:12:59 +0000 (11:12 -0400)]
ceph-volume: activate: try simple mode too
This is of dubious value to cephadm since /etc/ceph/osd/* won't be
populated inside of a container. However, it makes sense from a purely
ceph-volume perspective.
Sage Weil [Thu, 5 Aug 2021 16:02:22 +0000 (12:02 -0400)]
ceph-volume: lvm activate: infer bluestore or filestore
No need to require --filestore and/or --bluestore args since we can tell
from the LV tags which one it is.
We can't drop the arguments without breaking existing users, though, so
redefine them to mean *force* bluestore or filestore activation (even
though this will error out if the tags don't match).
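A sketch of the inference, assuming the LV tags expose a ceph.type value (the tag names and mapping are illustrative, not the exact ceph-volume logic):

```python
from typing import Mapping

def infer_objectstore(lv_tags: Mapping[str, str]) -> str:
    if lv_tags.get('ceph.type') == 'block':
        return 'bluestore'
    if lv_tags.get('ceph.type') in ('data', 'journal'):
        return 'filestore'
    raise RuntimeError('unable to infer objectstore from LV tags')
```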
dparmar18 [Fri, 25 Mar 2022 08:18:54 +0000 (13:48 +0530)]
doc/cephfs/add-remove-mds: added cephadm note, refined "Adding an MDS"
Description:
1) Add a note about using cephadm for setting up the cluster and MDS(s);
also mention the use of the ceph orchestrator if one needs to set up
MDS(s) manually.
2) Change the term `data point` to `directory` in point 1 under the
"Adding an MDS" section for better clarity.
Adam Kupczyk [Fri, 29 Apr 2022 21:32:43 +0000 (23:32 +0200)]
kv/RocksDBStore: Remove feature to make WholeSpaceIterator based on bounded iterator
The iterator-bounding feature was introduced to limit RocksDB iterators so that they
are less likely to traverse over tombstones.
It is used when listing keys in a fixed range, for example the OMAP keys of a specific object.
Extending this logic to WholeSpaceIterator is problematic,
since the prefix must be taken into account.
Fixes: https://tracker.ceph.com/issues/55444
Signed-off-by: Adam Kupczyk <akupczyk@redhat.com>
Adds a precondition to RocksDBStore::get_cf_handle(string, IteratorBounds)
to avoid duplicating the logic of its only caller (RocksDBStore::get_iterator).
Assertions will fail if the preconditions are not met.
bluestore: add config option to allow rocksdb iterator bounds to be disabled
Add osd_rocksdb_iterator_bounds_enabled config option to allow rocksdb iterator bounds to be disabled.
Also includes minor refactoring to shorten code associated with IteratorBounds initialization in bluestore.
Signed-off-by: Cory Snyder <csnyder@iland.com>
(cherry picked from commit ca3ccd9)
Conflicts:
src/common/options/osd.yaml.in
Cherry-pick notes:
- Conflicts due to option definition in common/options.cc in Pacific vs. common/options/osd.yaml.in in later releases
bluestore: set upper and lower bounds on rocksdb omap iterators
Limits RocksDB omap Seek operations to the relevant key range of the object's omap.
This prevents RocksDB from unnecessarily iterating over delete range tombstones in
irrelevant omap CF shards. Avoids extreme performance degradation commonly caused
by tombstones generated from RGW bucket resharding cleanup. Also prefers CFIteratorImpl
over ShardMergeIteratorImpl when we can determine that all keys within the specified
IteratorBounds must be in a single CF.