Ilya Dryomov [Thu, 19 Jan 2023 12:21:40 +0000 (13:21 +0100)]
doc/rbd/rbd-exclusive-locks: warn about automatic lock transitions
A lot of people aren't aware of automatic lock transitions and
wrongfully assume that exclusive lock means that the image remains
locked for as long as the client is running. Redo the explanation
and add a warning.
Avan Thakkar [Mon, 16 Jan 2023 12:41:06 +0000 (18:11 +0530)]
mgr/prometheus: export zero valued pg state metrics
As per the Prometheus documentation, omitting zero metrics is not a best practice. The metric value for all PG_STATES should be initialized to zero.
Fixes: https://tracker.ceph.com/issues/58471
Signed-off-by: Avan Thakkar <athakkar@redhat.com>
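The zero-initialization idea can be sketched in Python as follows. This is an illustration only: the state list is a shortened subset chosen for the example, and `count_pg_states` is a hypothetical helper, not the module's actual code.

```python
from collections import Counter

# Illustrative subset of PG states (assumption); the real module has its own list.
PG_STATES = ["active", "clean", "degraded", "peering", "stale"]

def count_pg_states(observed):
    """Return a count for every known PG state, including zero-valued ones."""
    counts = dict.fromkeys(PG_STATES, 0)  # every state starts at zero
    for state, n in Counter(observed).items():
        if state in counts:
            counts[state] = n
    return counts

metrics = count_pg_states(["active", "clean", "active"])
```

States that were never observed still appear in `metrics` with the value 0, so the exported series never disappears.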
Zac Dover [Thu, 19 Jan 2023 01:50:17 +0000 (11:50 +1000)]
doc/install: link to "cephadm installing ceph"
Link to "Installing Ceph" in the cephadm documentation instead of (as
was the case before this commit) to the cephadm overview page. Anyone
who clicks on the "cephadm" link in the context of the
doc/install/index.rst page is more likely to expect installation
instructions than to expect an explanation of what cephadm is.
Zac Dover [Wed, 11 Jan 2023 15:12:24 +0000 (01:12 +1000)]
doc/cephadm: s/osd/OSD/ where appropriate
Capitalize the initialism "OSD" where it occurs in natural language
in cephadm/host-management.rst. This PR answers a request made by
Anthony D'Atri and seconded by Cole Mitchell in https://github.com/ceph/ceph/pull/49699#discussion_r1066171002.
Zac Dover [Tue, 10 Jan 2023 15:55:55 +0000 (01:55 +1000)]
doc/css: add "span" padding to custom.css
Add "scroll-top-bar: 2em;" for the "span" html element in custom.css so
that the top bar doesn't get in the way of headings bounded by the "span"
element.
Zac Dover [Mon, 9 Jan 2023 18:09:20 +0000 (04:09 +1000)]
doc/rados: link to cephadm replacing osd section
Direct readers to the "Replacing an OSD" section in the cephadm
documentation, for cases in which the instructions in "Replacing an OSD"
in the RADOS documentation don't work.
Zac Dover [Sun, 8 Jan 2023 08:04:43 +0000 (18:04 +1000)]
doc/glossary: Clean up "Ceph Object Storage"
Remove redundant material under the "Ceph Object Storage" headword and
add a "See 'Ceph Object Store'" link. A future PR will provide a couple
of sentences that explain how object storage is what's really supporting
both CephFS and RBD.
Zac Dover [Fri, 6 Jan 2023 16:24:39 +0000 (02:24 +1000)]
doc/css: Add scroll-margin-top to h2 html element
Add "scroll-margin-top: 4em;" to the h2 html element's definition in
custom.css. This moves the text under all h2 html elements out of the
way of the sticky-header-style top bar, which previously obscured the
text.
Zac Dover [Fri, 6 Jan 2023 12:51:47 +0000 (22:51 +1000)]
doc/man: define --num-rep, --min-rep and --max-rep
Explain the "--num-rep", "--min-rep", and "--max-rep" options, which are
required when running "crushtool" commands with the "--show-mappings"
flag. Originally reported by Brad Fitzpatrick.
Zac Dover [Thu, 5 Jan 2023 12:25:43 +0000 (22:25 +1000)]
doc/css: add scroll-margin-top to dt elements
Add "scroll-margin-top: 3em;" to custom.css so that the header bar
doesn't obscure the text of headwords in glossary.rst. Note that this
applies only to elements in the documentation that are rendered into
HTML with the dt (which stands for "description term") tag. Other
modifications will be necessary in order to ensure
that the anchor points of non-dt elements are not obscured by the header
bar.
Zac Dover [Sun, 1 Jan 2023 12:06:54 +0000 (22:06 +1000)]
doc/start: add link-related metadocumentation
Add two kinds of link-related metadocumentation (documentation about how
to write documentation) to the "Documenting Ceph" section of the "Intro
to Ceph" document: 1. metadocumentation about external links, and 2.
metadocumentation about internal links.
Zac Dover [Sat, 31 Dec 2022 04:22:26 +0000 (14:22 +1000)]
doc/glossary: capitalize "DAS" correctly
Correctly capitalize "Direct-Attached Storage" in the glossary. (And
test the "Quincy" branch, which seems lately not to have picked up any
docs backports.)
Zac Dover [Fri, 30 Dec 2022 01:32:31 +0000 (11:32 +1000)]
doc/glossary: collate "releases" entries
Collect the "Releases"-related entries together under the "Releases"
headword, in order to give readers a sense at a glance of how the
different kinds of releases relate to one another.
Xiubo Li [Wed, 23 Nov 2022 05:24:38 +0000 (13:24 +0800)]
qa: switch to https protocol for repos' server
The git:// protocol is no longer reachable, so switch to
https://.
"git archive" does not support the https protocol, so we can no longer
use it to retrieve the tarball. Split the retrieval
into 3 steps:
1. clone the whole ceph repo
2. check out the commit/tag/branch
3. change directory to qa/workunits/
Signed-off-by: Xiubo Li <xiubli@redhat.com>
(cherry picked from commit 89177d65988c56324916de8394089b6e4b38aab7)
Conflicts:
- qa/workunits/fs/snaps/snaptest-git-ceph.sh: minor conflicts
- qa/machine_types/schedule_subset.sh: no need to fix this
- qa/tasks/cephfs/xfstests_dev.py: minor conflicts
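The three-step replacement for "git archive" could look roughly like this in Python. The helper names, the `dest` default, and the example URL are placeholders chosen for illustration, not what the qa scripts actually use.

```python
import subprocess

def fetch_workunits_commands(repo_url, ref, dest="ceph"):
    """Build the clone and checkout commands plus the directory to enter."""
    commands = [
        ["git", "clone", repo_url, dest],      # 1. clone the whole repo
        ["git", "-C", dest, "checkout", ref],  # 2. check out commit/tag/branch
    ]
    workdir = f"{dest}/qa/workunits"           # 3. directory to change into
    return commands, workdir

def fetch_workunits(repo_url, ref, dest="ceph"):
    """Run the commands and return the workunits directory."""
    commands, workdir = fetch_workunits_commands(repo_url, ref, dest)
    for cmd in commands:
        subprocess.run(cmd, check=True)  # raise if git fails
    return workdir

# Placeholder URL (assumption), only used to show the generated commands.
commands, workdir = fetch_workunits_commands("https://example.com/ceph.git", "main")
```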
Kamoltat [Wed, 14 Dec 2022 19:54:00 +0000 (19:54 +0000)]
mon/Monitor.cc: notify_new_monmap() skips removal of non-exist rank
Problem:
In RHCS the user can choose to manually remove a monitor rank
before shutting the monitor down, causing inconsistency in the monmap.
For example, if we remove mon.a from the monmap, there is a short period
during which mon.a is still operational and will try to remove itself from
the monmap, but we will run into an assertion in
ConnectionTracker::notify_ranks_removed().
Solution:
In Monitor::notify_new_monmap() we prevent the function
from removing our own rank, or
ranks that don't exist in the monmap.
FYI: this is an RHCS problem only; in ODF,
we never remove a monitor from the monmap
before shutting it down.
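A rough Python model of the guard described above. `ranks_safe_to_remove` and its arguments are hypothetical names; the real check lives in C++ inside Monitor::notify_new_monmap() and may be structured differently.

```python
def ranks_safe_to_remove(removed_ranks, my_rank, known_ranks):
    """Keep only removed ranks that are neither our own nor unknown to us."""
    safe = []
    for rank in removed_ranks:
        if rank == my_rank:
            continue  # never remove our own rank
        if rank not in known_ranks:
            continue  # skip ranks that do not exist in the monmap
        safe.append(rank)
    return safe

safe = ranks_safe_to_remove(removed_ranks=[0, 2, 5], my_rank=0, known_ranks={0, 1, 2})
```

Here rank 0 is skipped because it is our own rank and rank 5 is skipped because it does not exist, leaving only rank 2.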
--mon-initial-members does nothing but cause the monmap
to populate ``removed_ranks``, because the way we start
monitors in standalone tests uses ``run_mon $dir $id ..``
on each mon. Regardless of --mon-initial-members=a,b,c, if
we set --mon-host=$MONA,$MONB,$MONC (which we do in every test),
every time we run a monitor (e.g., run mon.b) it will pre-build
our monmap with
Now, with --mon-initial-members=a,b,c we are letting the
monmap know that the initial members should be named
a,b,c, of which only `b` is a match. So what
``MonMap::set_initial_members`` does is
remove noname-a and noname-c, which
populates `removed_ranks`.
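A simplified Python model of the effect described above. It assumes the pre-built monmap holds the names noname-a, b, noname-c (inferred from the text) and records ranks as they were before removal; the real ``MonMap::set_initial_members`` may do its bookkeeping differently.

```python
def set_initial_members(mon_names, initial_members):
    """Drop monitors not named in initial_members and record their ranks."""
    kept, removed_ranks = [], []
    for rank, name in enumerate(mon_names):
        if name in initial_members:
            kept.append(name)
        else:
            removed_ranks.append(rank)  # this is what pollutes removed_ranks
    return kept, removed_ranks

# Monmap as pre-built when starting mon.b (assumed layout).
kept, removed_ranks = set_initial_members(["noname-a", "b", "noname-c"], {"a", "b", "c"})
```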
Solution:
remove all instances of --mon-initial-members
in the standalone test as it has no impact on
the nature of the tests themselves.
When upgrading the monitors (including booting up),
we check whether `peer_tracker` is dirty. If
so, we clear it. Added some functions to the `Elector` and
`ConnectionTracker` classes to
check for a clean `peer_tracker`.
Moreover, there could be some cases where due
to startup weirdness or abnormal circumstances,
we might get a report from our own rank. Therefore,
it doesn't hurt to add a sanity check in
`ConnectionTracker::report_live_connection` and
`ConnectionTracker::report_dead_connection`.
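The sanity check can be pictured with a small Python stand-in. The class and its fields are hypothetical and much simpler than the real `ConnectionTracker`; only the "ignore reports about our own rank" behaviour is modelled.

```python
class ConnectionTrackerSketch:
    """Toy tracker that accumulates live/dead time per peer rank."""

    def __init__(self, rank):
        self.rank = rank
        self.live = {}  # peer rank -> accumulated live time
        self.dead = {}  # peer rank -> accumulated dead time

    def report_live_connection(self, peer_rank, units_alive):
        if peer_rank == self.rank:
            return False  # sanity check: ignore reports about ourselves
        self.live[peer_rank] = self.live.get(peer_rank, 0.0) + units_alive
        return True

    def report_dead_connection(self, peer_rank, units_dead):
        if peer_rank == self.rank:
            return False  # same sanity check for dead reports
        self.dead[peer_rank] = self.dead.get(peer_rank, 0.0) + units_dead
        return True

tracker = ConnectionTrackerSketch(rank=1)
accepted_self = tracker.report_live_connection(1, 2.0)
accepted_peer = tracker.report_live_connection(0, 2.0)
```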
In `notify_clear_peer_state()` we add another
mechanism for resetting our `peer_tracker.rank`
to match our own monitor.rank.
This is added so there is a way for us
to recover from a scenario where `peer_tracker.rank`
is messed up by adjusting the ranks or removing
ranks.
`notify_clear_peer_state()` can be triggered
by using the command:
`ceph connection scores reset`
Also in `clear_peer_reports`, besides
reassigning my_reports to an empty object,
we also have to make `my_reports` = `rank`
from `peer_tracker`, such that we don't get
-1 as a rank in my_reports.
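A toy Python version of the two reset paths described above. The `Report` and `TrackerSketch` types are invented for illustration, and having `notify_clear_peer_state` call `clear_peer_reports` is an assumption about how the pieces fit together.

```python
from dataclasses import dataclass, field

@dataclass
class Report:
    rank: int = -1  # a freshly constructed report has no valid rank
    history: dict = field(default_factory=dict)

class TrackerSketch:
    def __init__(self, rank):
        self.rank = rank
        self.my_reports = Report(rank=rank)
        self.peer_reports = {}

    def clear_peer_reports(self):
        self.peer_reports.clear()
        self.my_reports = Report()        # empty object, rank is -1 here
        self.my_reports.rank = self.rank  # restore rank so it is never -1

    def notify_clear_peer_state(self, monitor_rank):
        self.rank = monitor_rank          # resync with the monitor's own rank
        self.clear_peer_reports()

t = TrackerSketch(rank=3)
t.peer_reports[0] = Report(rank=0)
t.notify_clear_peer_state(monitor_rank=2)
```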
Kamoltat [Wed, 2 Nov 2022 01:59:52 +0000 (01:59 +0000)]
mon: change how we handle removed_ranks
When a new monitor joins, there is a chance that
it will receive a monmap that recently removed
a monitor, so ``removed_ranks`` will have some
content in it. A new monitor that joins
should never remove a rank in peer_tracker but
rather call ``notify_clear_peer_state()``
to reset the `peer_report`.
In the case where the monitor
has joined quorum before and is only 1
epoch behind the newest monmap provided
by the probe_replied monitor, we can
actually remove and adjust ranks in `peer_report`,
since we are sure that if there is any content in
removed_ranks, it has to be because the
next epoch removes a rank: on every
epoch update we always clear removed_ranks.
There is no point in keeping the content
of ``removed_ranks`` after monmap gets updated
to the epoch.
Therefore, clear ``removed_ranks`` on every update.
When there is a discontinuity between
monmaps of more than 1 epoch, or the new monitor never joined quorum before,
we always reset `peer_tracker`.
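The decision rule above, condensed into a Python sketch. The function name and return strings are made up for the example, and treating every case other than "exactly one epoch behind" as a reset is an assumption.

```python
def monmap_update_action(joined_quorum_before, local_epoch, new_epoch):
    """Decide how to treat peer_tracker when a newer monmap arrives."""
    one_epoch_behind = (new_epoch - local_epoch) == 1
    if joined_quorum_before and one_epoch_behind:
        return "adjust_ranks"        # safe to remove and adjust ranks
    return "reset_peer_tracker"      # new monitor or discontinuity: reset

actions = [
    monmap_update_action(True, 4, 5),   # rejoining monitor, one epoch behind
    monmap_update_action(True, 2, 5),   # more than one epoch behind
    monmap_update_action(False, 4, 5),  # never joined quorum before
]
```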
Moreover, it is beneficial for the monitor log to also record
which ranks have been removed as of the current
monmap. So add removed_ranks to `print_summary`
and `dump` in MonMap.cc.
In `ConnectionTracker::receive_peer_report`
we loop through ranks, which is bad when
`notify_rank_removed` was called before this and
the ranks are not adjusted yet. When we rely
on the rank in certain scenarios, we end up
with an extra peer_report copy, which we don't
want.
SOLUTION:
In `ConnectionTracker::receive_peer_report`,
instead of passing `report.rank` to the function
`ConnectionTracker::reports`, we pass `i.first`
so that old ranks are trimmed properly.
We also added an assert in notify_rank_removed(),
comparing the expected rank provided by the monmap
against the rank that we adjust ourselves to, as
a sanity check.
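The assert can be illustrated with a short Python sketch. `adjust_rank_after_removal` is a hypothetical stand-in; it assumes our rank shifts down by one when a lower rank is removed, and the real notify_rank_removed() does more than this.

```python
def adjust_rank_after_removal(my_rank, removed_rank, expected_rank):
    """Shift our rank down if a lower rank was removed, then sanity-check it."""
    if removed_rank < my_rank:
        my_rank -= 1  # ranks above the removed one move down by one
    # Sanity check: the adjusted rank must match what the monmap expects.
    assert my_rank == expected_rank, "adjusted rank disagrees with monmap"
    return my_rank

new_rank = adjust_rank_after_removal(my_rank=2, removed_rank=0, expected_rank=1)
```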
We edited test/mon/test_election.cc
to reflect the changes made in notify_rank_removed().