Adam King [Mon, 26 Sep 2022 18:02:19 +0000 (14:02 -0400)]
mgr/cephadm: fix handling of mgr upgrades with 3 or more mgrs
Fixes: https://tracker.ceph.com/issues/57675
When daemons are upgraded by cephadm, there are two criteria taken into
account for a daemon to be considered totally upgraded. The first is the
container image the daemon actually has currently. The second is the container
image of the mgr that deployed the daemon. I'll refer to these as a daemon
having the "correct version" and "correct deployed by". For reference,
the correct deployed by needs to be tracked as cephadm may change
something about the unit files it generates between versions and not
making sure daemons are deployed by the current version of cephadm
risks some obscure bugs.
The function _detect_need_upgrade takes a list of daemons and returns
two new lists. The first is all daemons from the input list that
are on the wrong version. The second is all daemons that are on the
right version but deployed by the wrong version. Additionally it returns
a bool to say whether the current active mgr must be upgraded (i.e. it
would belong in either of the two returned lists). Prior to this change,
the second list (daemons that are on the right version but have the
wrong deployed by version) would simply be added to the first list
if the active mgr did not need to be upgraded. The idea
is that if you are upgrading from X image to Y image, we can only
really "fix" the deployed by version of the daemon if the active mgr
is on the Y version as it will be the one deploying the daemon. So if
the active mgr is not upgraded we can just ignore the daemons that just
have the wrong deployed by version in the current iteration. All of this is
really only important when the mgr daemons are being upgraded. After all the
mgrs are upgraded any future upgrades of daemons will be done by a mgr on
the new version so deployed by version will always get completed
along with the version of the daemon itself. This system also works fine
for the typical 2 mgr setup.
Imagine mgr A and B on version X deployed by version X being upgraded to
version Y with A as active. First A deploys B with version Y. Now B
has version Y and deployed by version X. A then fails over to B as it
sees it needs to be upgraded. B then upgrades A so A now has version Y
and deployed by version Y. B then fails over to A as it sees it needs
to be upgraded as its deployed by version is still X. Finally, A
redeploys B and both mgrs are fully upgraded and everything is fine.
However, things can get trickier with 3 or more mgrs due to the
fact that cephadm does not control which other mgr takes over after
a failover. Imagine a similar scenario but now you have mgr
A, B, and C. First A will upgrade B and C to Y so they now
are both on version Y with deployed by version X. It then fails
over since it needs to be upgraded and let's say B takes over as
active. B then upgrades A so it now has version Y and deployed by
version Y. However, B will not redeploy C even though it should:
because B sees that it itself needs to be upgraded (its deployed by
version is wrong), it doesn't touch any daemon that just needs its
deployed by version fixed. B then fails over and let's say C takes
over. Since C still has the wrong deployed by version and therefore
thinks that it needs to be upgraded, it won't touch B since that
only needs its deployed by version fixed. C sees that it needs
to be upgraded, however, so it fails over. Let's say B takes over again.
You can see how we can end up in a loop here where B and C say they
need to be upgraded but never upgrade each other. It seems from what
I've seen that which mgr is picked after a failover isn't totally
random so this type of scenario can actually happen and it can get
stuck here until the user takes some action. The change here is
to skip the daemons that only need their deployed by version fixed
only when the active mgr is on the wrong version, rather than
whenever the active mgr needs any kind of upgrade. So in our example scenario
B would still have upgraded C the first time around as it would
see it is on the correct version Y and can therefore fix the deployed
by version for C. This is what the check always should have been
but since most of the testing is with 2 mgr daemons, and even with
more it's by chance that you end up in the loop, this issue wasn't seen.
Note that it is also possible to end up in this loop with
only 2 mgr daemons if some amount of manual upgrading of the mgr
daemons is done.
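As a rough illustration of the check described above, here is a minimal
Python sketch. It is not the actual cephadm code: the Daemon class, the
detect_need_upgrade name and signature, and the "fixed" switch are all
hypothetical and exist only to contrast the old condition with the new one.

```python
from dataclasses import dataclass
from typing import List, Tuple


@dataclass
class Daemon:
    name: str
    version: str       # container image the daemon currently runs
    deployed_by: str   # container image of the mgr that deployed it


def detect_need_upgrade(daemons: List[Daemon], active: Daemon,
                        target: str, fixed: bool = True
                        ) -> Tuple[List[Daemon], bool]:
    """Hypothetical sketch: return the daemons to act on in this
    iteration and whether the active mgr itself needs an upgrade."""
    wrong_version = [d for d in daemons if d.version != target]
    wrong_deployed_by = [d for d in daemons
                         if d.version == target and d.deployed_by != target]
    active_needs_upgrade = (active in wrong_version
                            or active in wrong_deployed_by)
    if fixed:
        # New check: an active mgr already on the target version can
        # fix the deployed by version of other daemons.
        can_fix_deployed_by = active.version == target
    else:
        # Old check: skip the second list whenever the active mgr
        # needs any kind of upgrade.
        can_fix_deployed_by = not active_needs_upgrade
    to_upgrade = wrong_version + (wrong_deployed_by
                                  if can_fix_deployed_by else [])
    return to_upgrade, active_needs_upgrade
```

With mgrs A (version Y, deployed by Y), B and C (version Y, deployed by X)
and B active, the old check yields an empty list, so B never redeploys C,
while the new check includes C in the list.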
crimson: create buffer from temporary_buffer with foreign-ptr by default
temporary_buffer is internally shareable with a thread-unsafe
ref-counter, so we need to make sure it is released on the same core
where it is constructed.
Users that need the extra efficiency can swap to create_local as needed.
Adam King [Thu, 22 Sep 2022 13:54:06 +0000 (09:54 -0400)]
Merge pull request #47756 from dparmar18/wip-dparmar-cephadm-after-revert
pybind/mgr/cephadm/upgrade: allow upgrades without reducing max_mds
Reviewed-by: Adam King <adking@redhat.com>
Reviewed-by: Anthony D'Atri <anthonyeleven@users.noreply.github.com>
Reviewed-by: Patrick Donnelly <pdonnell@redhat.com>
bea9f4b643c introduced a regression that makes the activate process
take a very long time to complete.
`_get_bluestore_info()`, which calls the `ceph-bluestore-tool` binary via
subprocess, is called an exponentially growing number of times even
though this is not needed.
Michael Fritch [Tue, 20 Sep 2022 22:20:15 +0000 (16:20 -0600)]
cephadm: patch the `cephadm.logger` class
Patch the logger class instead of globally mocking the class from within
the loaded source file. This was inadvertently allowing for the entire
test run to succeed, while a single run of a test case would fail due to
the missing mock.
For example:
`tox -e py3 tests/test_cephadm.py::TestShell::test_fsid`
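A minimal sketch of the per-test patching idea, under stated assumptions:
the `cephadm` namespace, `command_shell` and `run_test_fsid` below are
hypothetical stand-ins, not the real cephadm module or its tests. Only
`unittest.mock.patch.object` is a real API.

```python
import logging
import types
from unittest import mock

# Hypothetical stand-in for the loaded cephadm source module.
cephadm = types.SimpleNamespace(logger=logging.getLogger("cephadm"))


def command_shell(fsid):
    # Hypothetical function that logs through the module-level logger.
    cephadm.logger.info("Using fsid %s", fsid)
    return fsid


def run_test_fsid():
    # Patch the logger for the duration of this test only, so the test
    # carries its own mock and does not rely on a global one set up
    # elsewhere in the test run.
    with mock.patch.object(cephadm, "logger") as fake_logger:
        result = command_shell("0000")
        fake_logger.info.assert_called_once_with("Using fsid %s", "0000")
    return result
```

Because the mock is installed and removed inside the test itself, the test
behaves the same whether it runs alone or as part of the whole suite.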
Fixes: https://tracker.ceph.com/issues/57621
Signed-off-by: Michael Fritch <mfritch@suse.com>
osd/scrub: make on_replica_init() idempotent again
on_replica_init() might be called twice during replica scrub
initiation. Thus, it was designed to cause no issue if called
an extra time. That was broken when the scrubber-backend code
was introduced.
Signed-off-by: Avan Thakkar <athakkar@redhat.com>
Fixes an issue where the pid file config comes back empty from the config
dump, which prevents metrics from being added. Also get process metrics
only if pid_path isn't empty.
Nizamudeen A [Fri, 16 Sep 2022 07:20:26 +0000 (12:50 +0530)]
mgr/dashboard: use service call instead of form component
For creating the silence from the notification sidebar, instead of using
the silence form which will require initializing the whole component on
the landing page, we can just call the prometheus service and pass on
the required data to the service call. This will fix showing the
`Prometheus not configured` error every time we visit the landing page when
prometheus is not configured.
Fixes: https://tracker.ceph.com/issues/57576
Signed-off-by: Nizamudeen A <nia@redhat.com>
Sungmin Lee [Tue, 23 Aug 2022 04:51:31 +0000 (13:51 +0900)]
qa: add validation stage for deduplication.py
To validate that sample-dedup actually works, validate() runs in a
thread separate from sample-dedup and verifies the following
two things.
1. check that sample-dedup starts properly.
2. check that the references of all the chunk objects in the chunk tier
exist in the designated base pool.
This routine repeats max_valication_cnt times while
sample-dedup is running. If no failure is raised during the loop,
we can assume sample-dedup works correctly.
If not, assert() will stop the test.
If a reference of a chunk object doesn't exist in the base pool,
validate() gives it a second chance after repairing it (chunk-repair op)
to deal with false-positive reference inconsistency.
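The reference check with its second chance can be sketched roughly as
follows. This is an illustration only, not the code in deduplication.py:
the validate() signature, the `list_refs` and `repair` callables and the
`rounds` parameter are hypothetical, and the base pool is modelled as a
plain set of object names.

```python
from typing import Callable, Dict, List, Set


def validate(list_refs: Callable[[], Dict[str, List[str]]],
             base_pool: Set[str],
             repair: Callable[[str], None],
             rounds: int = 3) -> bool:
    """Hypothetical sketch: check that every reference held by a chunk
    object exists in the base pool, retrying once after a repair."""
    for _ in range(rounds):
        # Snapshot the items so repair() may update the mapping.
        for chunk, refs in list(list_refs().items()):
            missing = [r for r in refs if r not in base_pool]
            if not missing:
                continue
            # Second chance: repair (stands in for the chunk-repair op),
            # then re-read the references for this chunk.
            repair(chunk)
            still_missing = [r for r in list_refs()[chunk]
                             if r not in base_pool]
            assert not still_missing, \
                f"{chunk} references missing objects: {still_missing}"
    return True
```

A stale reference that the repair step removes passes the check, while a
reference that is still missing after the repair stops the test.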
Signed-off-by: Sungmin Lee <sung_min.lee@samsung.com>
mgr/dashboard: Add details to the modal which displays the `safe-to-destroy` result
- Add warning-type information when the OSDs are not safe to destroy
- Add info-type information when the OSDs are safe to destroy
Fixes: https://tracker.ceph.com/issues/37327
Signed-off-by: Francesco Torchia <francesco.torchia@suse.com>
Adam C. Emerson [Fri, 16 Sep 2022 00:05:57 +0000 (20:05 -0400)]
build: Remove -fno-new-ttp-matching flag
This was added in the upgrade to C++17. It's no longer needed since
fixing Clang compatibility got rid of non-conforming templates.
Getting rid of it is also a (minor) quality of life enhancement since
it removes a spurious error when using Clang based build tools
(language server, etc.) while compiling with GCC.
Signed-off-by: Adam C. Emerson <aemerson@redhat.com>