Zac Dover [Thu, 19 Jan 2023 01:50:17 +0000 (11:50 +1000)]
doc/install: link to "cephadm installing ceph"
Link to "Installing Ceph" in the cephadm documentation instead of (as
was the case before this commit) to the cephadm overview page. Anyone
who clicks on the "cephadm" link in the context of the
doc/install/index.rst page is more likely to expect installation
instructions than to expect an explanation of what cephadm is.
Aashish Sharma [Tue, 17 Jan 2023 09:56:34 +0000 (15:26 +0530)]
mgr/dashboard: fix bucket encryption checkbox
Fixes: https://tracker.ceph.com/issues/58474
The encryption checkbox in the bucket creation form remains disabled after setting the vault authentication method as agent.
Zac Dover [Wed, 11 Jan 2023 15:12:24 +0000 (01:12 +1000)]
doc/cephadm: s/osd/OSD/ where appropriate
Capitalize the initialism "OSD" where it occurs in natural language
in cephadm/host-management.rst. This PR answers a request made by
Anthony D'Atri and seconded by Cole Mitchell in https://github.com/ceph/ceph/pull/49699#discussion_r1066171002.
Zac Dover [Tue, 10 Jan 2023 15:55:55 +0000 (01:55 +1000)]
doc/css: add "span" padding to custom.css
Add "scroll-top-bar: 2em;" for the "span" html element in custom.css so
that the top bar doesn't get in the way of headings bounded by the
"span" element.
Zac Dover [Mon, 9 Jan 2023 18:09:20 +0000 (04:09 +1000)]
doc/rados: link to cephadm replacing osd section
Direct readers to the "Replacing an OSD" section in the cephadm
documentation, for cases in which the instructions in "Replacing an OSD"
in the RADOS documentation don't work.
Zac Dover [Sun, 8 Jan 2023 08:04:43 +0000 (18:04 +1000)]
doc/glossary: Clean up "Ceph Object Storage"
Remove redundant material under the "Ceph Object Storage" headword and
add a "See 'Ceph Object Store'" link. A future PR will provide a couple
of sentences that explain how object storage is what's really supporting
both CephFS and RBD.
Zac Dover [Fri, 6 Jan 2023 16:24:39 +0000 (02:24 +1000)]
doc/css: Add scroll-margin-top to h2 html element
Add "scroll-margin-top: 4em;" to the h2 html element's definition in
custom.css. This moves the text under all h2 html elements out of the
way of the sticky-header-style top bar, which previously obscured the
text.
Zac Dover [Fri, 6 Jan 2023 12:51:47 +0000 (22:51 +1000)]
doc/man: define --num-rep, --min-rep and --max-rep
Explain the "--num-rep", "--min-rep", and "--max-rep" options, which are
required when running "crushtool" commands with the "--show-mappings"
flag. Originally reported by Brad Fitzpatrick.
Zac Dover [Thu, 5 Jan 2023 12:25:43 +0000 (22:25 +1000)]
doc/css: add scroll-margin-top to dt elements
Add "scroll-margin-top: 3em;" to custom.css so that the header bar
doesn't obscure the text of headwords in glossary.rst. Note that this
applies only to elements in the documentation that are rendered into
HTML with the dt (which stands for "description term") tag. Other
modifications will be necessary in order to ensure
that the anchor points of non-dt elements are not obscured by the header
bar.
Zac Dover [Sun, 1 Jan 2023 12:06:54 +0000 (22:06 +1000)]
doc/start: add link-related metadocumentation
Add two kinds of link-related metadocumentation (documentation about how
to write documentation) to the "Documenting Ceph" section of the "Intro
to Ceph" document: 1. metadocumentation about external links, and 2.
metadocumentation about internal links.
Zac Dover [Sat, 31 Dec 2022 04:22:26 +0000 (14:22 +1000)]
doc/glossary: capitalize "DAS" correctly
Correctly capitalize "Direct-Attached Storage" in the glossary. (And
test the "Quincy" branch, which seems lately not to have picked up any
docs backports.)
Zac Dover [Fri, 30 Dec 2022 01:32:31 +0000 (11:32 +1000)]
doc/glossary: collate "releases" entries
Collect the "Releases"-related entries together under the "Releases"
headword, in order to give readers a sense at a glance of how the
different kinds of releases relate to one another.
Ilya Dryomov [Thu, 22 Dec 2022 15:32:44 +0000 (16:32 +0100)]
qa: switch to curl for qemu-xfstests
This is a follow-up for commit 631899ffeb84 ("qa: switch back to git
protocol for qemu-xfstests"), needed for the same "ancient execution
environment" reason.
Ilya Dryomov [Mon, 19 Dec 2022 17:54:08 +0000 (18:54 +0100)]
qa: switch back to git protocol for qemu-xfstests
As noted in commit 89177d65988c ("qa: switch to https protocol for
repos' server"), git.ceph.com mirror doesn't make git:// available
anymore. However, run_xfstests-obsolete.sh has "obsolete" in its
name for a reason -- due to an ancient execution environment, git://
is the only viable option:
$ git clone https://git.ceph.com/xfstests-dev.git
Cloning into 'xfstests-dev'...
error: gnutls_handshake() failed: A TLS fatal alert has been received. while accessing https://git.ceph.com/xfstests-dev.git/info/refs
fatal: HTTP request failed
Kamoltat [Wed, 14 Dec 2022 19:54:00 +0000 (19:54 +0000)]
mon/Monitor.cc: notify_new_monmap() skips removal of non-existent rank
Problem:
In RHCS, the user can choose to manually remove a monitor rank
before shutting the monitor down, which causes inconsistency in the
monmap. For example, if we remove mon.a from the monmap, there is a
short period in which mon.a is still operational and tries to remove
itself from the monmap, but it runs into an assertion in
ConnectionTracker::notify_ranks_removed().
Solution:
In Monitor::notify_new_monmap(), we prevent the function
from removing our own rank or
ranks that don't exist in the monmap.
FYI: this is an RHCS problem only. In ODF,
we never remove a monitor from the monmap
before shutting it down.
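A minimal sketch of the guard described above, written in Python for
brevity rather than in the C++ of Monitor.cc. The helper name and its
parameters are hypothetical and only model the idea of skipping our own
rank and ranks that the monmap does not contain:

```python
def ranks_safe_to_remove(removed_ranks, my_rank, monmap_ranks):
    """Filter removed ranks down to the ones a monitor may act on.

    Hypothetical helper (not Ceph's API): skips our own rank and any
    rank that the monmap does not contain.
    """
    safe = []
    for rank in removed_ranks:
        if rank == my_rank:
            continue  # never remove ourselves
        if rank not in monmap_ranks:
            continue  # rank does not exist in the monmap
        safe.append(rank)
    return safe
```

For example, with removed ranks [0, 2, 5], our own rank 0, and a monmap
holding ranks {0, 1, 2}, only rank 2 would be passed on for removal.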
--mon-initial-members does nothing except cause the monmap
to populate ``removed_ranks``, because the way we start
monitors in standalone tests uses ``run_mon $dir $id ..``
on each mon. Regardless of --mon-initial-members=a,b,c, if
we set --mon-host=$MONA,$MONB,$MONC (which we do in every single test),
every time we run a monitor (e.g., run mon.b) it will pre-build
our monmap with
Now, with --mon-initial-members=a,b,c we are letting the
monmap know that the initial members should be named
a,b,c, of which only `b` is a match. So what
``MonMap::set_initial_members`` does is
remove noname-a and noname-c, which
populates `removed_ranks`.
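The effect described above can be modeled with a small Python sketch.
It is an illustration only: the function name is hypothetical, a
monitor's rank is assumed to be its position in the list at the moment
of removal, and the real ``MonMap::set_initial_members`` does more than
this:

```python
def prune_to_initial_members(mon_names, initial_members):
    """Drop monitors whose names are not initial members.

    Hypothetical model: returns the surviving names and the ranks that
    were removed, where a rank is assumed to be the list position at
    the time of removal.
    """
    remaining = list(mon_names)
    removed_ranks = []
    for name in list(mon_names):
        if name in initial_members:
            continue  # name matches an initial member, keep it
        removed_ranks.append(remaining.index(name))
        remaining.remove(name)
    return remaining, removed_ranks


remaining, removed_ranks = prune_to_initial_members(
    ["noname-a", "b", "noname-c"], {"a", "b", "c"}
)
```

Only `b` survives, and two entries end up in the removed-ranks list even
though no monitor was ever taken out of the cluster.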
Solution:
Remove all instances of --mon-initial-members
in the standalone tests, as they have no impact on
the nature of the tests themselves.
When upgrading the monitors (including booting up),
we check whether `peer_tracker` is dirty. If
so, we clear it. Added some functions to the `Elector` and
`ConnectionTracker` classes to
check for a clean `peer_tracker`.
Moreover, there could be some cases where due
to startup weirdness or abnormal circumstances,
we might get a report from our own rank. Therefore,
it doesn't hurt to add a sanity check in
`ConnectionTracker::report_live_connection` and
`ConnectionTracker::report_dead_connection`.
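A minimal Python sketch of such a sanity check, assuming that a report
about our own rank is simply dropped. The class and its fields are
hypothetical stand-ins, not the C++ `ConnectionTracker`:

```python
class PeerTrackerSketch:
    """Hypothetical stand-in for a connection tracker."""

    def __init__(self, rank):
        self.rank = rank
        self.live = {}  # peer rank -> accumulated time seen alive

    def report_live_connection(self, peer_rank, units_alive):
        """Record a live peer; ignore reports about our own rank."""
        if peer_rank == self.rank:
            return False  # sanity check: drop a report from ourselves
        self.live[peer_rank] = self.live.get(peer_rank, 0.0) + units_alive
        return True
```

A dead-connection handler would carry the same check at its top.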
In `notify_clear_peer_state()` we add another
mechanism for resetting our `peer_tracker.rank`
to match our own monitor.rank.
This is added so that there is a way for us
to recover from a scenario where `peer_tracker.rank`
is messed up from adjusting the ranks or removing
ranks.
`notify_clear_peer_state()` can be triggered
by using the command:
`ceph connection scores reset`
Also in `clear_peer_reports`, besides
reassigning my_reports to an empty object,
we also have to set the rank in `my_reports` to the `rank`
from `peer_tracker`, so that we don't get
-1 as a rank in my_reports.
Kamoltat [Wed, 2 Nov 2022 01:59:52 +0000 (01:59 +0000)]
mon: change how we handle removed_ranks
When a new monitor joins, there is a chance that
it will receive a monmap that recently removed
a monitor, so ``removed_ranks`` will have some
content in it. A new monitor that joins
should never remove ranks in peer_tracker but
should rather call ``notify_clear_peer_state()``
to reset the `peer_report`.
In the case where the monitor
has joined quorum before and is only 1
epoch behind the newest monmap provided
by the probe_replied monitor, we can
actually remove and adjust ranks in `peer_report`,
since we are sure that any content in
removed_ranks must be there because the
next epoch removes a rank: on every
epoch update we clear removed_ranks.
There is no point in keeping the content
of ``removed_ranks`` after monmap gets updated
to the epoch.
Therefore, clear ``removed_ranks`` every update.
When there is a discontinuity between
monmaps of more than 1 epoch, or the new monitor has never joined quorum before,
we always reset `peer_tracker`.
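The decision described above can be sketched as follows. This is a
Python illustration only; the function name, its parameters, and the
returned strings are hypothetical, and the caller is assumed to pass a
monmap epoch newer than its own:

```python
def choose_action(joined_quorum_before, my_epoch, new_epoch):
    """Decide how to treat removed ranks when a newer monmap arrives.

    Hypothetical model: ranks are adjusted only by a monitor that has
    been in quorum before and is exactly one epoch behind; every other
    case resets the peer tracker.
    """
    if joined_quorum_before and new_epoch - my_epoch == 1:
        return "adjust_ranks"
    return "reset_peer_tracker"
```

A monitor at epoch 4 that has been in quorum and receives epoch 5 adjusts
ranks, while the same monitor receiving epoch 6, or a brand-new monitor,
resets its tracker.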
Moreover, it is beneficial for the monitor log to also record
which ranks have been removed as of the current
monmap. So add removed_ranks to `print_summary`
and `dump` in MonMap.cc.
In `ConnectionTracker::receive_peer_report`
we loop through ranks, which is bad when
`notify_rank_removed` has been called before this and
the ranks are not adjusted yet. When we rely
on the rank in certain scenarios, we end up
with an extra peer_report copy, which we don't
want.
SOLUTION:
In `ConnectionTracker::receive_peer_report`,
instead of passing `report.rank` to the function
`ConnectionTracker::reports`, we pass `i.first`
so that old ranks are trimmed properly.
We also added an assert in notify_rank_removed(),
comparing the expected rank provided by the monmap
against the rank that we adjust ourselves to, as
a sanity check.
We edited test/mon/test_election.cc
to reflect the changes made in notify_rank_removed().
Pere Diaz Bou [Fri, 11 Nov 2022 09:43:01 +0000 (10:43 +0100)]
mgr/prometheus: expose daemon health metrics
Until now, daemon health metrics were stored without being used. One of
the most helpful metrics there is SLOW_OPS for OSDs and MONs, which this
commit tries to expose in order to provide fine-grained metrics for
finding troublesome OSDs, instead of relying on a lone health check for
slow ops across the whole cluster.
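To illustrate what a per-daemon metric could look like, here is a
minimal Python sketch that renders SLOW_OPS counts in the Prometheus
text exposition format. The metric name `daemon_health_metrics` and the
label names are assumptions made for this example and are not taken from
the commit:

```python
def render_slow_ops(slow_ops_by_daemon):
    """Render per-daemon SLOW_OPS counts as Prometheus text lines.

    Hypothetical helper: the metric and label names are illustrative.
    """
    lines = ["# TYPE daemon_health_metrics gauge"]
    for daemon, value in sorted(slow_ops_by_daemon.items()):
        lines.append(
            f'daemon_health_metrics{{type="SLOW_OPS",ceph_daemon="{daemon}"}} {value}'
        )
    return "\n".join(lines)


output = render_slow_ops({"osd.1": 3, "mon.a": 0})
```

With one sample per daemon, a query can single out the OSDs that report
slow ops instead of showing only a cluster-wide total.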