git.apps.os.sepia.ceph.com Git

mgr/cephadm: fix handling of mgr upgrades with 3 or more mgrs

Fixes: https://tracker.ceph.com/issues/57675
When daemons are upgraded by cephadm, there are two criteria taken into
account for a daemon to be considered totally upgraded. The first is the
container image the daemon actually has currently. The second is the container
image of the mgr that deployed the daemon. I'll refer to these as a daemon
having the "correct version" and "correct deployed by". For reference,
the correct deployed by needs to be tracked as cephadm may change
something about the unit files it generates between versions and not
making sure daemons are deployed by the current version of cephadm
risks some obscure bugs.

The function _detect_need_upgrade takes a list of daemons and returns
two new lists. The first is all daemons from the input list that
are on the wrong version. The second is all daemons that are on the
right version but deployed by the wrong version. Additionally it returns
a bool to say whether the current active mgr must be upgraded (i.e. it
would belong in either of the two returned lists). Prior to this change,
how it would work is the second list (list of daemons that are on the right
version but have the wrong deployed by version) would simply be added to
the first list if the active mgr does not need to be upgraded. The idea
is that if you are upgrading from X image to Y image, we can only
really "fix" the deployed by version of the daemon if the active mgr
is on the Y version as it will be the one deploying the daemon. So if
the active mgr is not upgraded we can just ignore the daemons that just
have the wrong deployed by version in the current iteration. All of this is
really only important when the mgr daemons are being upgraded. After all the
mgrs are upgraded any future upgrades of daemons will be done by a mgr on
the new version so deployed by version will always get completed
along with the version of the daemon itself. This system also works fine
for the typical 2 mgr setup.

Imagine mgr A and B on version X deployed by version X being upgraded to
version Y with A as active. First A deploys B with version Y. Now B
has version Y and deployed by version X. A then fails over to B as it
sees it needs to be upgraded. B then upgrades A so A now has version Y
and deployed by version Y. B then fails over to A as it sees it needs
to be upgraded as its deployed by version is still X. Finally, A
redeploys B and both mgrs are fully upgraded and everything is fine.

However, things can get trickier with 3 or more mgrs due to the
fact that cephadm does not control which other mgr takes over after
a failover. Imagine a similar scenario but now you have mgr
A, B, and C. First A will upgrade B and C to Y so they now
are both on version Y with deployed by version X. It then fails
over since it needs to be upgraded and let's say B takes over as
active. B then upgrade A so it now has version Y and deployed by
version Y. However, it will not redeploy C even though it should
as, given it sees that it needs to be upgraded due to its deployed by
version being wrong, it doesn't touch any daemon that just needs its
deployed by version fixed. It then fails over and let's say C takes
over. Since it still has the wrong deployed by version and therefore
thinks that it needs to be upgraded, it won't touch B since that
only needs its deployed by version fixed. It sees that it needs
to be upgraded, however, so it fails over. Let's say B takes over again.
You can see how we can end up in a loop here where B and C say they
need to be upgraded but never upgrade each other. It seems from what
I've seen that which mgr is picked after a failover isn't totally
random so this type of scenario can actually happen and it can get
stuck here until the user takes some action. The change here is
to stop skipping daemons that only need their deployed by version
fixed whenever the active mgr needs an upgrade, and to skip that list
only if the active mgr is on the wrong version. So in our example scenario
B would still have upgraded C the first time around as it would
see it is on the correct version Y and can therefore fix the deployed
by version for C. This is what the check always should have been
but since most of the testing is with 2 mgr daemons, and even with
more it's only by chance that you end up in the loop, this issue wasn't seen.

Will add that it is also possible to end up in this loop with
only 2 mgr daemons if some amount of manual upgrading of the mgr
daemons is done.
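
A minimal sketch of the old and new checks, for illustration only. The
Daemon class, plan_upgrade and its "fixed" flag are hypothetical names
and do not mirror the real _detect_need_upgrade signature; the sketch
also folds the caller's list merging into one function and assumes the
active mgr never redeploys itself (it fails over instead).

```python
from dataclasses import dataclass
from typing import List, Tuple


@dataclass
class Daemon:
    name: str
    version: str      # container image the daemon is running
    deployed_by: str  # image of the mgr that deployed the daemon


def plan_upgrade(daemons: List[Daemon], active: Daemon, target: str,
                 fixed: bool = True) -> Tuple[List[str], bool]:
    """Return (names the active mgr will redeploy, active mgr must fail over)."""
    wrong_version = [d for d in daemons if d.version != target]
    wrong_deployed_by = [d for d in daemons
                         if d.version == target and d.deployed_by != target]

    active_on_wrong_version = active.version != target
    need_upgrade_self = active_on_wrong_version or active.deployed_by != target

    # Old check: skip the "deployed by" list whenever the active mgr needs
    # any kind of upgrade. New check: skip it only when the active mgr is
    # on the wrong version.
    skip_deployed_by = active_on_wrong_version if fixed else need_upgrade_self

    todo = list(wrong_version)
    if not skip_deployed_by:
        todo += wrong_deployed_by
    # Assumption: the active mgr cannot redeploy itself, so leave it out.
    names = [d.name for d in todo if d.name != active.name]
    return names, need_upgrade_self


# The 3 mgr scenario after A has been upgraded by B, with B still active.
a = Daemon("A", "Y", "Y")
b = Daemon("B", "Y", "X")
c = Daemon("C", "Y", "X")
old_plan = plan_upgrade([a, b, c], active=b, target="Y", fixed=False)
new_plan = plan_upgrade([a, b, c], active=b, target="Y", fixed=True)
```

With the old check B redeploys nothing before failing over, which is
what allows the loop; with the new check B redeploys C first.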

Signed-off-by: Adam King <adking@redhat.com>
(cherry picked from commit fa0ab94b40a12ffcef2e889eecbd2757f24d0811)

Merge pull request #49911 from zdover23/wip-doc-2023-01-30-special-order-backport-49908-to-pacific

pacific: doc/dev: backport 49908 to P (Upgrade Testing Docs)

doc/dev: backport 49908 to P (Upgrade Testing Docs)

Backport https://github.com/ceph/ceph/pull/49908 to Pacific.

Signed-off-by: Zac Dover <zac.dover@gmail.com>

Merge pull request #49899 from zdover23/wip-doc-2023-01-27-backport-49897-to-pacific

pacific: doc/rados/operations: Fix double prompt

Reviewed-by: Cole Mitchell <cole.mitchell@gmail.com>

doc/rados/operations: Fix double prompt

In monitoring.rst a double prompt was rendered, one non-selectable and one selectable. Remove the selectable prompt.

Signed-off-by: Ville Ojamo <14869000+bluikko@users.noreply.github.com>
(cherry picked from commit 7d7282c7bdf0396f9a192badeedf66976fa01e5e)

Merge pull request #49893 from zdover23/wip-doc-2023-01-27-backport-49890-to-pacific

pacific: doc/dev: use underscores in config vars

Merge pull request #49896 from zdover23/wip-doc-2023-01-27-backport-49894-to-pacific

pacific: doc/rados/operations: Fix indentation

doc/rados/operations: Fix indentation

Fix invalid indentation that caused indentation to be rendered wrong in control.rst.

Signed-off-by: Ville Ojamo <14869000+bluikko@users.noreply.github.com>
(cherry picked from commit f2e19d39ed303c700c6aab206a1c000c5731baeb)

doc/dev: use underscores in config vars

Use underscores instead of spaces in config vars in ceph_krb_auth.rst.

Signed-off-by: Ville Ojamo <14869000+bluikko@users.noreply.github.com>
(cherry picked from commit ed88f7f378fcb70be57351271aa19b30f81555d4)

Merge pull request #49852 from ceph/pacific-release

v16.2.11

Merge pull request #49875 from zdover23/wip-doc-2023-01-26-backport-49873-to-pacific

pacific: doc/dev: add Slack to Dev Guide essentials

doc/dev: add Slack to Dev Guide essentials

Add Ceph's Slack to doc/developer_guide/essentials.rst.

Signed-off-by: Zac Dover <zac.dover@proton.me>
(cherry picked from commit c3288cf5be232bded5ccd70925da2452599903ab)

16.2.11

Signed-off-by: Ceph Release Team <ceph-maintainers@ceph.io>

Merge pull request #48933 from s0nea/wip-58042-pacific

pacific: ceph-mixin: fix ceph_hosts variable

Reviewed-by: Avan Thakkar <athakkar@redhat.com>

Merge pull request #49833 from zdover23/wip-doc-2023-01-23-backport-49778-to-pacific

pacific: doc/rados: refine ceph-conf.rst

Merge pull request #49822 from zdover23/wip-doc-2023-01-21-backport-49820-to-pacific

pacific: doc/rados: refine pool-pg-config-ref.rst

Reviewed-by: Anthony D'Atri <anthony.datri@gmail.com>
Reviewed-by: Cole Mitchell <cole.mitchell@gmail.com>

doc/rados: refine ceph-conf.rst

Correct grammar and usage in ceph-conf.rst.

https://tracker.ceph.com/issues/58485

Signed-off-by: Zac Dover <zac.dover@gmail.com>
(cherry picked from commit dfb0b4a6ad233ad410ab5952a6a3db89c46cc0aa)

Merge pull request #49520 from rhcs-dashboard/wip-58319-pacific

pacific: mgr/prometheus: expose daemon health metrics

Reviewed-by: Pegonzal <NOT@FOUND>
Reviewed-by: Aashish Sharma <aasharma@redhat.com>
Reviewed-by: Avan Thakkar <athakkar@redhat.com>

doc/rados: refine pool-pg-config-ref.rst

Remove pleonasm.

Signed-off-by: Zac Dover <zac.dover@gmail.com>
(cherry picked from commit 11724c9b8e0244a0b44daa3db96022c50d603090)

Merge pull request #49805 from idryomov/wip-doc-exclusive-lock-transitions-pacific

pacific: doc/rbd/rbd-exclusive-locks: warn about automatic lock transitions

Reviewed-by: Ramana Raja <rraja@redhat.com>

doc/rbd/rbd-exclusive-locks: warn about automatic lock transitions

A lot of people aren't aware of automatic lock transitions and
wrongfully assume that exclusive lock means that the image remains
locked for as long as the client is running. Redo the explanation
and add a warning.

Signed-off-by: Ilya Dryomov <idryomov@gmail.com>
(cherry picked from commit 2af7252b332c17e3ad8a363f13e2404d1276e2bd)

doc/rbd/rbd-exclusive-locks: don't mention "profile rbd" requirement twice

It's (much better) described in the Blocklisting section.

Signed-off-by: Ilya Dryomov <idryomov@gmail.com>
(cherry picked from commit c966ea99bc5d6fb6e4a3dc41becedb62efa4dd74)

Merge pull request #49793 from zdover23/wip-doc-2023-01-20-backport-49764-to-pacific

pacific: doc/ceph-volume: refine encryption.rst

Reviewed-by: Anthony D'Atri <anthony.datri@gmail.com>

doc/ceph-volume: refine encryption.rst

Improve the word choice and grammar of
doc/ceph-volume/lvm/encryption.rst.

Signed-off-by: Zac Dover <zac.dover@gmail.com>
(cherry picked from commit 6f7f7c8f651319bc75847dfc784213d5111e6502)

Merge pull request #49786 from rhcs-dashboard/wip-58502-pacific

pacific: mgr/prometheus: export zero valued pg state metrics

Reviewed-by: Ernesto Puerta <epuertat@redhat.com>

mgr/prometheus: export zero valued pg state metrics

Fixes: https://tracker.ceph.com/issues/58471
Signed-off-by: Avan Thakkar <athakkar@redhat.com>
As per the Prometheus documentation, omitting zero metrics is not a best practice. The metric value for all PG_STATES should be initialized to zero.
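
A minimal sketch of the idea, assuming a shortened PG_STATES list and a
hypothetical count_pg_states helper; this is not the mgr/prometheus
code itself. Every known state starts at zero, so states with no PGs
are still exported instead of being omitted.

```python
from typing import Dict, Iterable

# Illustrative subset only; the real PG_STATES list is longer.
PG_STATES = ["active", "clean", "degraded", "peering"]


def count_pg_states(reported: Iterable[str]) -> Dict[str, int]:
    """Count PGs per state, starting every known state at zero."""
    counts = {state: 0 for state in PG_STATES}
    for combined in reported:
        # A PG reports a combined state such as "active+clean".
        for state in combined.split("+"):
            if state in counts:
                counts[state] += 1
    return counts


result = count_pg_states(["active+clean", "active+clean", "active+degraded"])
```

Here "peering" is present in the result with a value of 0 even though
no PG reported it.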

(cherry picked from commit 17d1ecc914b2fe6c5d9e8045a999985988c39447)

Merge pull request #49782 from zdover23/wip-doc-2023-01-19-backport-49780-to-pacific

pacific: doc/install: link to "cephadm installing ceph"

Reviewed-by: Cole Mitchell <cole.mitchell@gmail.com>

doc/install: link to "cephadm installing ceph"

Link to "Installing Ceph" in the cephadm documentation instead of (as
was the case before this commit) to the cephadm overview page. Anyone
who clicks on the "cephadm" link in the context of the
doc/install/index.rst page is more likely to expect installation
instructions than to expect an explanation of what cephadm is.

Signed-off-by: Zac Dover <zac.dover@gmail.com>
(cherry picked from commit f04001deb302d98c950bff68cdecc87a65a66d4d)

Merge pull request #49758 from zdover23/wip-doc-2023-01-16-backport-49747-to-pacific

pacific: doc/ceph-volume: update LUKS docs

doc/ceph-volume: update LUKS docs

Remove references that claim that Ceph uses only LUKS version 1.

https://tracker.ceph.com/issues/58354

Signed-off-by: Zac Dover <zac.dover@gmail.com>
(cherry picked from commit 27974c1d8a413cd5035acf27978cee36c2f6ccb4)

Merge pull request #49752 from zdover23/wip-doc-2023-01-16-backport-49745-to-pacific

pacific: doc/start: add RST escape character rules for bold

Merge pull request #49750 from zdover23/wip-doc-2023-01-16-backport-49716-to-pacific

pacific: doc/rbd: format iscsi-initiator-linux.rbd better

doc/start: add RST escape character rules for bold

Explain how to escape the bold notation (**) within words in RST.

Signed-off-by: Zac Dover <zac.dover@gmail.com>
(cherry picked from commit 84524c264638710dd7d00a193e25a511cd0de19d)

doc/rbd: format iscsi-initiator-linux.rbd better

Add prompts and clean up the lists in doc/rbd/iscsi-initiator-linux.rbd.

Signed-off-by: Zac Dover <zac.dover@gmail.com>
(cherry picked from commit 26b8e9cb93ccaf72a5e46aae6c436be9deaf3ac5)

Merge pull request #49739 from zdover23/wip-doc-2023-01-14-backport-49736-to-pacific

pacific: doc/dev: add git branch management commands

Reviewed-by: Cole Mitchell <cole.mitchell@gmail.com>

doc/dev: add git branch management commands

Add git branch deleting and search commands to the "Basic Workflow" page
of the Developer Guide.

Signed-off-by: Zac Dover <zac.dover@gmail.com>
(cherry picked from commit f1c0c3ec3d0d56d4615a77d5912018dc0542c959)

Merge pull request #49695 from Matan-B/wip-matanb-pacific-mgr-packaging

pacific: mgr/prometheus: use vendored "packaging" instead

Reviewed-by: Laura Flores <lflores@redhat.com>

Merge pull request #49718 from zdover23/wip-doc-2023-01-11-backport-49715-to-pacific

pacific: doc/cephadm: s/osd/OSD/ where appropriate

doc/cephadm: s/osd/OSD/ where appropriate

Capitalize the initialism "OSD" where it occurs in natural language
in cephadm/host-management.rst. This PR answers a request made by
Anthony D'Atri and seconded by Cole Mitchell in https://github.com/ceph/ceph/pull/49699#discussion_r1066171002.

Signed-off-by: Zac Dover <zac.dover@gmail.com>
(cherry picked from commit 559fea8b427ba60b49ebfe16d67fa21168e3e1bc)

mgr/prometheus: use vendored "packaging" instead

backport of: cf6089200d96fc56b08ee17a4e31f19823370dc8

Signed-off-by: Matan Breizman <mbreizma@redhat.com>

Merge pull request #49696 from idryomov/wip-58398-pacific

pacific: doc/man/ceph-rbdnamer: remove obsolete udev rule

Reviewed-by: Christopher Hoffman <choffman@redhat.com>

Merge pull request #49707 from zdover23/wip-doc-2023-01-11-backport-49699-to-pacific

pacific: doc/cephadm: refine "Removing Hosts"

Reviewed-by: Cole Mitchell <cole.mitchell@gmail.com>
Reviewed-by: Anthony D'Atri <anthony.datri@gmail.com>

Merge pull request #49705 from zdover23/wip-doc-2023-01-11-backport-49703-to-pacific

pacific: doc/rados: move colon

Reviewed-by: Anthony D'Atri <anthony.datri@gmail.com>
Reviewed-by: Cole Mitchell <cole.mitchell@gmail.com>

doc/cephadm: refine "Removing Hosts"

An intended edit to remove a redundant indefinite article became a
longer (but still brief) full editorial pass.

Signed-off-by: Zac Dover <zac.dover@gmail.com>
(cherry picked from commit 2e1fd6308726ecfdfe39c51c6e8fc99e922a3f84)

doc/rados: move colon

Move colon in add-or-rm-osds.rst so that the sentence reads properly.

Signed-off-by: Zac Dover <zac.dover@gmail.com>
(cherry picked from commit b1a53d7621241272f7f9f2f99dfd912b6117215b)

Merge pull request #49702 from zdover23/wip-doc-2023-01-11-backport-49700-to-pacific

pacific: doc/css: add top-bar padding for h3 html element

Reviewed-by: Cole Mitchell <cole.mitchell@gmail.com>

doc/css: add top-bar padding for h3 html element

Add "scroll-margin-top: 4em;" property to h3 html element.

Signed-off-by: Zac Dover <zac.dover@gmail.com>
(cherry picked from commit c28cb533909a9ac0010e160e86d2a8dc23ea9675)

Merge pull request #49694 from zdover23/wip-doc-2023-01-11-backport-49692-to-pacific

pacific: doc/css: add "span" padding to custom.css

doc/man/ceph-rbdnamer: remove obsolete udev rule

Fixes: https://tracker.ceph.com/issues/58398
Signed-off-by: Ilya Dryomov <idryomov@gmail.com>
(cherry picked from commit 8ca3bd5042094aa0e67d728131f8fa942919717e)

doc/css: add "span" padding to custom.css

Add "scroll-top-bar: 2em;" for the "span" html element in custom.css so
that the top bar doesn't get in the way of headings bounded by the "span"
element.

See also https://github.com/ceph/ceph/pull/49644.

Signed-off-by: Zac Dover <zac.dover@gmail.com>
(cherry picked from commit dd9555b0ae0507c996588e82a62b0a674530a16a)

Merge pull request #49681 from zdover23/wip-doc-2023-01-10-backport-49677-2nd-attempt-to-pacific

pacific: doc/rados: link to cephadm replacing osd section

Reviewed-by: Cole Mitchell <cole.mitchell@gmail.com>

Merge pull request #49684 from zdover23/wip-doc-2023-01-10-backport-49663-to-pacific

pacific: doc: fix a typo

Reviewed-by: Cole Mitchell <cole.mitchell@gmail.com>

doc: fix a typo

Signed-off-by: Brad Fitzpatrick <brad@danga.com>
(cherry picked from commit c670906b49de87514e3b3cce28519c0eba7fad26)

doc/rados: link to cephadm replacing osd section

Direct readers to the "Replacing an OSD" section in the cephadm
documentation, for cases in which the instructions in "Replacing an OSD"
in the RADOS documentation don't work.

https://tracker.ceph.com/issues/58401

Signed-off-by: Zac Dover <zac.dover@gmail.com>
(cherry picked from commit 7b8f75ebd7cd29faf2eec2fba1fefa17a390b92d)

Merge pull request #49647 from adk3798/pacific-maintenance-syntax

pacific: doc/dev/cephadm: fix host maintenance enter/exit syntax -

Reviewed-by: Anthony D'Atri <anthony.datri@gmail.com>

Merge pull request #49668 from zdover23/wip-doc-2023-01-09-backport-49665-to-pacific

pacific: doc/glossary: Clean up "Ceph Object Storage"

Reviewed-by: Anthony D'Atri <anthony.datri@gmail.com>

doc/glossary: Clean up "Ceph Object Storage"

Remove redundant material under the "Ceph Object Storage" headword and
add a "See 'Ceph Object Store'" link. A future PR will provide a couple
of sentences that explain how object storage is what's really supporting
both CephFS and RBD.

Signed-off-by: Zac Dover <zac.dover@gmail.com>
(cherry picked from commit 6bede3505f2967155644b0edfe9db1a8eb82a93c)

Merge pull request #49662 from zdover23/wip-doc-2023-01-07-backport-49658-to-pacific

pacific: doc/css: Add scroll-margin-top to h2 html element

Reviewed-by: Anthony D'Atri <anthony.datri@gmail.com>

Merge pull request #49660 from zdover23/wip-doc-2023-01-07-backport-49653-to-pacific

pacific: doc/man: define --num-rep, --min-rep and --max-rep

Reviewed-by: Anthony D'Atri <anthony.datri@gmail.com>

doc/css: Add scroll-margin-top to h2 html element

Add "scroll-margin-top: 4em;" to the h2 html element's definition in
custom.css. This moves the text under all h2 html elements out of the
way of the sticky-header-style top bar, which previously obscured the
text.

Signed-off-by: Zac Dover <zac.dover@gmail.com>
(cherry picked from commit af048ca911fcd84e6a73d32999c772f64e95d67f)

doc/man: define --num-rep, --min-rep and --max-rep

Explain the "--num-rep", "--min-rep", and "--max-rep" options, which are
required when running "crushtool" commands with the "--show-mappings"
flag. Originally reported by Brad Fitzpatrick.

https://tracker.ceph.com/issues/58374

Signed-off-by: Zac Dover <zac.dover@gmail.com>
(cherry picked from commit 10fa01d075b09726e3f07d2cae83ced4e418deae)

Merge pull request #49645 from zdover23/wip-doc-2023-01-06-backport-49643-to-pacific

pacific: doc/_static: add scroll-margin-top to custom.css

Reviewed-by: Anthony D'Atri <anthony.datri@gmail.com>

doc/dev/cephadm: fix host maintenance enter/exit syntax -

Signed-off-by: Ranjini Mandyam Narasiodeyar <rmandyam@rmandyam.remote.csb>
(cherry picked from commit ffea636176162c5db0a2f70e1bec9daf56ac8cfc)

doc/_static: add scroll-margin-top to custom.css

Add 4em of padding to the class "section", so that linked-to
destinations are not obscured by the top bar.

Signed-off-by: Zac Dover <zac.dover@gmail.com>
(cherry picked from commit 5738271498c1d4581e44b077580f1131950d1ba3)

Merge pull request #49640 from zdover23/wip-doc-2023-01-05-backport-49637-to-pacific

pacific: doc/css: add scroll-margin-top to dt elements

Reviewed-by: Anthony D'Atri <anthony.datri@gmail.com>

doc/css: add scroll-margin-top to dt elements

Add "scroll-margin-top: 3em;" to custom.css so that the header bar
doesn't obscure the text of headwords in glossary.rst. Note that this
applies only to elements in the documentation that are rendered into
HTML with the dt (which stands for "description term" or "description
list") tag. Other modifications will be necessary in order to ensure
that the anchor points of non-dt elements are not obscured by the header
bar.

Signed-off-by: Zac Dover <zac.dover@gmail.com>
(cherry picked from commit 1865de86622592555571c1539f3c013c7936d53e)

Merge pull request #49337 from ljflores/wip-tracker-54992

pacific: qa: run e2e test on centos only

Merge pull request #49622 from zdover23/wip-doc-2023-01-04-backport-49620-to-pacific

pacific: doc: fix a couple grammatical things

Reviewed-by: Anthony D'Atri <anthony.datri@gmail.com>

doc: fix a couple grammatical things

Signed-off-by: Brad Fitzpatrick <brad@danga.com>
(cherry picked from commit b9b6011c11450e292e4d233a444d776cca8fd86e)

Merge pull request #49616 from zdover23/wip-doc-2022-01-03-backport-49613-to-pacific

pacific: doc/start: add Anthony D'Atri's suggestions

Reviewed-by: Anthony D'Atri <anthony.datri@gmail.com>

doc/start: add Anthony D'Atri's suggestions

Add the suggestions made by Anthony D'Atri in
https://github.com/ceph/ceph/pull/49609.

Signed-off-by: Zac Dover <zac.dover@gmail.com>
(cherry picked from commit 6141c5f0618432b5dc23d53b5920cab48aa5db7e)

Merge pull request #49611 from zdover23/wip-doc-2023-01-02-backport-49609-to-pacific

pacific: doc/start: refine "Quirks of RST"

Reviewed-by: Anthony D'Atri <anthony.datri@gmail.com>

doc/start: refine "Quirks of RST"

Refine the language that was added yesterday, language that explains how
certain aspects of RST work.

Signed-off-by: Zac Dover <zac.dover@gmail.com>
(cherry picked from commit cdc7b6035414ea46b1e583d21a3a821f041c5417)

Merge pull request #49607 from zdover23/wip-doc-2023-01-01-backport-49606-to-pacific

pacific: doc/start: add link-related metadocumentation

Reviewed-by: Anthony D'Atri <anthony.datri@gmail.com>

doc/start: add link-related metadocumentation

Add two kinds of link-related metadocumentation (documentation about how
to write documentation) to the "Documenting Ceph" section of the "Intro
to Ceph" document: 1. metadocumentation about external links, and 2.
metadocumentation about internal links.

Signed-off-by: Zac Dover <zac.dover@gmail.com>
(cherry picked from commit 7517404f72819907700036cd1a287174bad38f10)

Merge pull request #49604 from zdover23/wip-doc-2022-12-31-backport-49602-to-pacific

pacific: doc/glossary: capitalize "DAS" correctly

Reviewed-by: Cole Mitchell <cole.mitchell@gmail.com>

doc/glossary: capitalize "DAS" correctly

Correctly capitalize "Direct-Attached Storage" in the glossary. (And
test the "Quincy" branch, which seems lately not to have picked up any
docs backports.)

Signed-off-by: Zac Dover <zac.dover@gmail.com>
(cherry picked from commit 7c70a66414325edf44db6183fd35a007d3e44fd9)

Merge pull request #49601 from zdover23/wip-doc-2022-12-30-backport-49599-to-pacific

pacific: doc/glossary: collate "releases" entries

Reviewed-by: Cole Mitchell <cole.mitchell@gmail.com>

doc/glossary: collate "releases" entries

Collect the "Releases"-related entries together under the "Releases"
headword, in order to give readers a sense at a glance of how the
different kinds of releases relate to one another.

Signed-off-by: Zac Dover <zac.dover@gmail.com>
(cherry picked from commit 39377f7db114b344727249792de2f5d98b72c1d7)

Merge pull request #49596 from zdover23/wip-doc-2022-12-30-backport-49593-to-pacific

pacific: doc/glossary: s/an/each/ where it's needed

Reviewed-by: Anthony D'Atri <anthony.datri@gmail.com>

Merge pull request #49598 from zdover23/wip-doc-2022-12-30-backport-49488-to-pacific

pacific: doc/rbd: refine rbd-exclusive-locks.rst

Reviewed-by: Ilya Dryomov <idryomov@redhat.com>

doc/rbd: refine rbd-exclusive-locks.rst

Refine grammar (mostly semantics) in rbd-exclusive-locks.rst.

Co-authored-by: Ilya Dryomov <idryomov@redhat.com>
Signed-off-by: Zac Dover <zac.dover@gmail.com>
(cherry picked from commit 62b00127513c60a1a99b149ab4878ee11763f4fd)

doc/glossary: s/an/each/ where it's needed

s/an/each/ in accordance with the suggestion made by Anthony D'Atri
here: https://github.com/ceph/ceph/pull/49590/files#r1058390357

Signed-off-by: Zac Dover <zac.dover@gmail.com>
(cherry picked from commit 40a9f1594cf6a5d3660b53981c5c398c9b294758)

Merge pull request #49590 from zdover23/wip-doc-2022-12-28-backport-49584-to-pacific

pacific: doc/glossary: clean OSD id-related entries

Reviewed-by: Anthony D'Atri <anthony.datri@gmail.com>

Merge pull request #49592 from zdover23/wip-doc-2022-12-28-backport-49587-to-pacific

pacific: doc/rbd: s/wuold/would/ in rados-rbd-cmds.rst

Reviewed-by: Ilya Dryomov <idryomov@redhat.com>

doc/rbd: s/wuold/would/ in rados-rbd-cmds.rst

s/wuold/would/ in rados-rbd-cmds.rst.

Signed-off-by: Zac Dover <zac.dover@gmail.com>
(cherry picked from commit 39c18021d6cc998b44050de9d67b22d3a4fae893)

doc/glossary: clean OSD id-related entries

Tidy up the sentences under the headwords "OSD fsid", "OSD id", and "OSD
uuid".

Signed-off-by: Zac Dover <zac.dover@gmail.com>
(cherry picked from commit e16fe735305e1d61b1635455175dd41557e13819)

Merge pull request #49470 from lxbsz/wip-58293

pacific: qa: switch to https protocol for repos' server

Reviewed-by: Brad Hubbard <bhubbard@redhat.com>
Reviewed-by: Venky Shankar <vshankar@redhat.com>
Reviewed-by: Ilya Dryomov <idryomov@redhat.com>

Merge pull request #49312 from kamoltat/wip-ksirivad-backport-pacific-bz-2121452

pacific: mon/Elector: Change how we handle removed_ranks and notify_rank_removed()

Reviewed-by: Gregory Farnum <gfarnum@redhat.com>

qa: remove unused 'teuthology.orchestra.run' in xfstests_dev.py

Fixes: https://tracker.ceph.com/issues/58133
Signed-off-by: Xiubo Li <xiubli@redhat.com>
(cherry picked from commit 496bf662eff07dc95c8b3ff64c9753519884c1e5)

qa: switch to https protocol for ffsb and xfstests-dev repos

Since the git protocol is not reachable any more, just switch it
to https.

Fixes: https://tracker.ceph.com/issues/58133
Signed-off-by: Xiubo Li <xiubli@redhat.com>
(cherry picked from commit 4c97a9e469cfe060531be12988ff087ad2ff36c5)

Conflicts:
- qa/workunits/suites/ffsb.sh: no need to fix

qa: switch to https protocol for repos' server

Since git:// is not reachable any more, switch to
https://.

git archive does not support the https protocol, so we can no longer
use git archive to retrieve the tarball. Split
this into 3 steps:

1, clone the whole ceph repo
2, checkout the commit/tag/branch
3, then change directory to qa/workunits/.
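
The three steps can be sketched as a small plan builder. This is a
hypothetical helper for illustration, not the code in the qa scripts;
the function name, the destination path, and the example URL and ref
are assumptions.

```python
import posixpath
from typing import List, Tuple


def workunit_fetch_plan(repo_url: str, ref: str,
                        dest: str) -> Tuple[List[List[str]], str]:
    """Return the git commands to run and the directory to change into."""
    commands = [
        ["git", "clone", repo_url, dest],      # 1. clone the whole repo
        ["git", "-C", dest, "checkout", ref],  # 2. check out commit/tag/branch
    ]
    # 3. the directory to change into afterwards
    workdir = posixpath.join(dest, "qa", "workunits")
    return commands, workdir


commands, workdir = workunit_fetch_plan(
    "https://github.com/ceph/ceph.git", "pacific", "/tmp/ceph")
```

The commands are returned rather than executed so that the plan can be
inspected without network access.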

Signed-off-by: Xiubo Li <xiubli@redhat.com>
(cherry picked from commit 89177d65988c56324916de8394089b6e4b38aab7)
Conflicts:
- qa/workunits/fs/snaps/snaptest-git-ceph.sh: minor conflicts
- qa/machine_types/schedule_subset.sh: no need to fix this
- qa/tasks/cephfs/xfstests_dev.py: minor conflicts

Merge pull request #49575 from zdover23/wip-doc-2022-12-26-backport-49573-to-pacific

pacific: doc/glossary: disambiguate clauses

Reviewed-by: Cole Mitchell <cole.mitchell@gmail.com>

doc/glossary: disambiguate clauses

Disambiguate various clauses, most of which contain forms of the verb
"to require".

Signed-off-by: Zac Dover <zac.dover@gmail.com>
(cherry picked from commit 643f430a0f82b4018b0076883a249e4556bc956a)

mon/Monitor.cc: notify_new_monmap() skips removal of non-exist rank

Problem:
In RHCS the user can choose to manually remove a monitor rank
before shutting the monitor down, causing inconsistency in the monmap.
For example, if we remove mon.a from the monmap, there is a short period
where mon.a is still operational and will try to remove itself from
the monmap, but we will run into an assertion in
ConnectionTracker::notify_rank_removed().

Solution:
In Monitor::notify_new_monmap() we prevent the function
from removing our own rank, or
ranks that don't exist in the monmap.

FYI: this is an RHCS problem only, in ODF,
we never remove a monitor from monmap
before shutting it down.

Fixes: https://tracker.ceph.com/issues/58049
Signed-off-by: Kamoltat <ksirivad@redhat.com>
(cherry picked from commit 924e7ec92bbaa6efd0ef816c1cb101ff7972616c)

qa/standalone/mon: remove --mon-initial-members setting

Problem:

--mon-initial-members does nothing except cause the monmap
to populate ``removed_ranks``, because the way we start
monitors in standalone tests uses ``run_mon $dir $id ..``
on each mon. Regardless of --mon-initial-members=a,b,c, if
we set --mon-host=$MONA,$MONB,$MONC (which we do in every single test),
every time we run a monitor (e.g., run mon.b) it will pre-build
our monmap with

```
noname-a=mon.noname-a addrs v2:127.0.0.1:7127/0,
b=mon.b addrs v2:127.0.0.1:7128/0,
noname-c=mon.noname-c addrs v2:127.0.0.1:7129/0,
```

Now, with --mon-initial-members=a,b,c we are telling the
monmap that the initial members should be named
a, b, and c, of which only `b` is a match. So what
``MonMap::set_initial_members`` does is
remove noname-a and noname-c, which
populates `removed_ranks`.
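
A toy model of that name matching, for illustration only. The function
below is a hypothetical stand-in, not the C++
``MonMap::set_initial_members``; it only shows which pre-built entries
survive the name filter.

```python
from typing import List, Set, Tuple


def filter_initial_members(monmap_names: List[str],
                           initial: Set[str]) -> Tuple[List[str], List[str]]:
    """Split monmap entries into those kept and those removed by name."""
    kept = [name for name in monmap_names if name in initial]
    removed = [name for name in monmap_names if name not in initial]
    return kept, removed


# The monmap pre-built when starting mon.b with --mon-host listing 3 mons.
kept, removed = filter_initial_members(
    ["noname-a", "b", "noname-c"], {"a", "b", "c"})
```

Only `b` matches, so both placeholder entries are removed, and each
removal is what ends up recorded in `removed_ranks`.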

Solution:

Remove all instances of --mon-initial-members
in the standalone tests, as it has no impact on
the nature of the tests themselves.

Fixes: https://tracker.ceph.com/issues/58132
Signed-off-by: Kamoltat <ksirivad@redhat.com>
(cherry picked from commit e1c095dcf0a019bff01d2d8c819e5f95604c8da5)

mon: clear connection score during update & add sanity check live/dead connection report

When upgrading the monitors (including booting up),
we check whether `peer_tracker` is dirty. If
so, we clear it. Added some functions to the `Elector` and
`ConnectionTracker` classes to
check for a clean `peer_tracker`.

Moreover, there could be some cases where due
to startup weirdness or abnormal circumstances,
we might get a report from our own rank. Therefore,
it doesn't hurt to add a sanity check in
`ConnectionTracker::report_live_connection` and
`ConnectionTracker::report_dead_connection`.

Fixes: https://tracker.ceph.com/issues/58049
Signed-off-by: Kamoltat <ksirivad@redhat.com>
(cherry picked from commit 25ce77c7984587f457eba9bd06e416ef06f4e1c7)

mon/Elector & ConnectionTracker: reset peer_tracker.rank

In `notify_clear_peer_state()` we add another
mechanism for resetting our `peer_tracker.rank`
to match our own monitor.rank.

This is added so there is a way for us
to recover from a scenario where `peer_tracker.rank`
is messed up from adjusting the ranks or removing
ranks.

`notify_clear_peer_state()` can be triggered
by using the command:

`ceph connection scores reset`

Also in `clear_peer_reports`, besides
reassigning my_reports to an empty object,
we also have to set the rank of `my_reports` to the `rank`
from `peer_tracker`, so that we don't get
-1 as a rank in my_reports.

Fixes: https://tracker.ceph.com/issues/58049
Signed-off-by: Kamoltat <ksirivad@redhat.com>
(cherry picked from commit 55cf717a3070d10b6b02af33a66d6ad0acbba0f6)

mon: change how we handle removed_ranks

When a new monitor joins, there is a chance that
it will receive a monmap that recently removed
a monitor and ``removed_ranks`` will have some
content in it. A new monitor that joins
should never remove a rank in peer_tracker but
rather call ``notify_clear_peer_state()``
to reset the `peer_report`.

In the case of a monitor that
has joined quorum before and is only 1
epoch behind the newest monmap provided
by the probe_replied monitor, we can
actually remove and adjust ranks in `peer_report`,
since we are sure that if there is any content in
removed_ranks, then it has to be because in the
next epoch we are removing a rank, as on every
epoch update we always clear the removed_ranks.

There is no point in keeping the content
of ``removed_ranks`` after monmap gets updated
to the epoch.

Therefore, clear ``removed_ranks`` every update.

When there is a discontinuity between
monmaps of more than 1 epoch or the new monitor never joined quorum before,
we always reset `peer_tracker`.
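
The two cases above can be summarized in a small decision function.
This is a sketch under stated assumptions: monmap_action and its
return strings are hypothetical names, not Monitor.cc code, and only
the case where the new monmap is newer than ours is modelled.

```python
def monmap_action(joined_quorum_before: bool, my_epoch: int,
                  new_epoch: int) -> str:
    """Decide how to treat removed_ranks when a newer monmap arrives."""
    if new_epoch <= my_epoch:
        # Not covered by the description above; refuse rather than guess.
        raise ValueError("sketch only models a newer monmap")
    if joined_quorum_before and new_epoch - my_epoch == 1:
        # Exactly one epoch behind: removed_ranks describes this update.
        return "adjust_ranks"
    # New monitor, or a gap of more than one epoch.
    return "reset_peer_tracker"
```

For example, a monitor that was in quorum at epoch 4 and receives epoch
5 adjusts ranks, while the same monitor receiving epoch 7 resets.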

Moreover, it is beneficial for the monitor log to also log
which ranks have been removed at the current time
of the monmap. So add removed_ranks to `print_summary`
and `dump` in MonMap.cc.

Fixes: https://tracker.ceph.com/issues/58049
Signed-off-by: Kamoltat <ksirivad@redhat.com>
(cherry picked from commit 04402576fecf1cb97f515b5dc42261a77837e400)

Conflicts:
src/mon/Monitor.cc - trivial fix

mon/ConnectionTracker.cc: Improve notify_rank_removed()

PROBLEM:

In `ConnectionTracker::receive_peer_report`
we loop through ranks, which is bad when
`notify_rank_removed` was called before this and
the ranks are not adjusted yet. When we rely
on the rank in certain scenarios, we end up
with an extra peer_report copy, which we don't
want.

SOLUTION:

In `ConnectionTracker::receive_peer_report`,
instead of passing `report.rank` to the function
`ConnectionTracker::reports`, we pass `i.first`
so that we trim old ranks properly.

We also added an assert in notify_rank_removed(),
comparing the expected rank provided by the monmap
against the rank that we adjust ourselves to, as
a sanity check.

We edited test/mon/test_election.cc
to reflect the changes made in notify_rank_removed().

Fixes: https://tracker.ceph.com/issues/58049
Signed-off-by: Kamoltat <ksirivad@redhat.com>
(cherry picked from commit 7c52ccec76bc7e7f9678cc9d78d106e17f9ad8f7)

Conflicts:
src/mon/Elector.cc - trivial fix
src/mon/Elector.h - trivial fix