Adam King [Sun, 15 Jan 2023 22:18:47 +0000 (17:18 -0500)]
mgr/cephadm: fix haproxy nfs backend server ip gathering
Fixes: https://tracker.ceph.com/issues/58465
Previously, if there were 2 nfs daemons of the same
rank, we did not check the rank generation, which
is intended to mark which one is the "real" one of that
rank in cases where we cannot remove the other one due
to its host being offline. The nfs of a given rank with
the highest rank_generation is the one we want haproxy
to use for its backend IP. Since we didn't actually
check this, which IP we got was random, depending on
the order in which we happened to iterate over the nfs
daemons of the same rank. If the nfs with the
lower rank_generation on an offline host happened
to come later in the iteration, we'd use that one
for the IP, which is incorrect.
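
The selection rule described above can be sketched as follows. This is a minimal illustration, not the cephadm implementation: the `NfsDaemon` class, its field names, and `pick_backend_ips` are hypothetical stand-ins introduced only for this example.

```python
from dataclasses import dataclass
from typing import Dict, Iterable


@dataclass
class NfsDaemon:
    # Hypothetical stand-in for an nfs daemon description (names are assumptions).
    rank: int
    rank_generation: int
    ip: str


def pick_backend_ips(daemons: Iterable[NfsDaemon]) -> Dict[int, str]:
    """For each rank, keep the IP of the daemon with the highest rank_generation."""
    best: Dict[int, NfsDaemon] = {}
    for daemon in daemons:
        current = best.get(daemon.rank)
        # Replace the candidate only when this daemon has a newer generation,
        # so the result no longer depends on iteration order.
        if current is None or daemon.rank_generation > current.rank_generation:
            best[daemon.rank] = daemon
    return {rank: daemon.ip for rank, daemon in sorted(best.items())}
```

With this rule, a stale daemon on an offline host loses to the newer daemon of the same rank regardless of where it appears in the list.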
Adam King [Sun, 15 Jan 2023 21:30:53 +0000 (16:30 -0500)]
mgr/cephadm: don't attempt daemon actions for daemons on offline hosts
They'll just fail anyway, and we would waste time waiting
for the connection to time out. We have other places in
the serve loop that will check if the host is back
online.
Nizamudeen A [Thu, 9 Mar 2023 11:51:44 +0000 (17:21 +0530)]
mgr/dashboard: custom image for kcli bootstrap script
Stable branches like quincy pull the quay.io/ceph/ceph:v17 image to
bootstrap the ceph cluster in test environments. This causes issues
because the branches change constantly but the image does not. So
use the quay.ceph.io repo to bring up the cluster in test environments.
Matan Breizman [Thu, 15 Dec 2022 17:05:15 +0000 (17:05 +0000)]
mon/OSDMonitor: Skip check_pg_num on pool size decrease
When changing the pool size we use check_pg_num to ensure we do not
exceed the `mon_max_pg_per_osd` value. This check should only be applied
when increasing the size, to avoid underflows.
(The same is already applied when changing pg_num.)
Matan Breizman [Mon, 19 Dec 2022 09:58:06 +0000 (09:58 +0000)]
mon/OSDMonitor: Simplify check_pg_num()
* See: https://tracker.ceph.com/issues/47062.
Originally check_pg_num did not take into account the osds
under the root used by the crush rule.
This behavior resulted in an inaccurate pg num per osd count.
* Avoid summing all of the projected pg num and only later
on subtracting the pg num if the pool did exist.
* With this change, we only count the projected pg num of
the pools affected by the crush rule.
Same for osd number, instead of dividing the projected
pg number by all of the osdmap osds, divide only by
the osds used by the crush rule.
* Avoid differentiating between whether the mapping epoch
is later than the osdmap epoch or not. Always check the pg
num according to crush rule.
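
The per-crush-rule accounting described in the bullets above can be sketched roughly as follows. This is a simplified model under stated assumptions, not the OSDMonitor code: the `Pool` class, the `rule_osds` mapping, and `projected_pgs_per_osd` are hypothetical names, and returning 0 when a rule maps to no osds is an assumption made for this example.

```python
from dataclasses import dataclass
from typing import Dict, Set


@dataclass
class Pool:
    # Hypothetical, simplified pool record (names are assumptions).
    pg_num: int
    size: int
    crush_rule: int


def projected_pgs_per_osd(pools: Dict[str, Pool], rule_osds: Dict[int, Set[int]],
                          name: str, pg_num: int, size: int, crush_rule: int) -> float:
    """Average PG replicas per osd for the osds used by `crush_rule`,
    assuming pool `name` is created or changed to (pg_num, size)."""
    osds = rule_osds.get(crush_rule, set())
    if not osds:
        return 0.0  # assumption: nothing to check when the rule maps to no osds
    # Start from the requested values for the pool being created or changed.
    projected = pg_num * size
    for pool_name, pool in pools.items():
        # Count only other pools that share the same crush rule.
        if pool_name == name or pool.crush_rule != crush_rule:
            continue
        projected += pool.pg_num * pool.size
    # Divide by the osds used by the rule, not by every osd in the osdmap.
    return projected / len(osds)
```

A caller would compare the returned value against `mon_max_pg_per_osd` to decide whether to reject the request.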
Anthony D'Atri [Thu, 1 Dec 2022 19:04:30 +0000 (14:04 -0500)]
src/mon: clarify pool creation failure due to max_pgs_per_osd error message
Signed-off-by: Anthony D'Atri <anthony.datri@gmail.com>
Note: This commit is cherry-picked as a dependency
for later commits in this backport.
(cherry picked from commit 88e8eeca7571fc314bc30a52cd17218fa9fac500)
Tongliang Deng [Fri, 31 Dec 2021 06:02:25 +0000 (14:02 +0800)]
mon/OSDMonitor: fix integer underflow of check_pg_num
Underflow of the `uint64_t projected` variable occurs when
the sum of current acting pg num and new pg num we specified
is less than the pg num calculated from pg info.
Signed-off-by: Tongliang Deng <dengtongliang@gmail.com>
Note: This commit is cherry-picked as a dependency
for later commits in this backport.
(cherry picked from commit bd9813f5e1a3addca1a57360d58b50b120e0e5f3)
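
To illustrate the kind of underflow described above: subtracting a larger value from a smaller one in an unsigned 64-bit variable wraps around to a huge number instead of going negative. The sketch below emulates `uint64_t` arithmetic in Python; the helper names are hypothetical, and the clamping guard is only one possible way to avoid the wrap, not necessarily what the commit does.

```python
U64_MAX = (1 << 64) - 1


def u64_sub(a: int, b: int) -> int:
    """Subtract the way a C++ uint64_t would: the result wraps modulo 2**64."""
    return (a - b) & U64_MAX


def guarded_sub(a: int, b: int) -> int:
    """Clamp at zero instead of wrapping (one possible guard, an assumption)."""
    return a - b if a >= b else 0
```

With the unguarded version, `u64_sub(10, 12)` yields a value close to 2**64, which would make any "exceeds the limit" comparison spuriously true.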
jerryluo [Mon, 25 Jan 2021 16:10:57 +0000 (00:10 +0800)]
mon/OSDMonitor: Make the pg_num check more accurate
In the check_pg_num function, find the osds that correspond to the
current pool's crush rule, and calculate whether the average value of
pg_num on these osds will exceed the value of 'mon_max_pg_per_osd'. Make
the pg_num check more accurate by counting all the pgs on the osds used
by the new pool.
Fixes: https://tracker.ceph.com/issues/47062
Signed-off-by: Jerry Luo <luojierui@chinatelecom.cn>
Note: This commit has been reverted and is cherry-picked as
dependency for other commits in this backport.
(cherry picked from commit c726ce9e5088b30d29e0db5c0ecc8c03fe41da1d)
Zac Dover [Sat, 21 Jan 2023 16:32:59 +0000 (02:32 +1000)]
doc/install: refine index.rst
Refine English sentences in doc/install/index.rst. Remove adverbial
phrases of time that refer to Nautilus-era features as "new", since that
was four years ago.
Zac Dover [Wed, 8 Mar 2023 01:52:12 +0000 (11:52 +1000)]
doc/install: update index.rst
Update index.rst by making minor grammar improvements. This file was
long overdue for a backport to Reef, Quincy, and Pacific, so this commit
was a good way to pass a human eyeball over the text before making those
backports.
Prashant D [Tue, 30 Aug 2022 07:29:24 +0000 (03:29 -0400)]
mon/LogMonitor: Fix log last
The ceph log last command outputs all the cluster
logs generated from logm entries at DBG level,
irrespective of their log level. We must output
cluster logs generated from logm according
to the log level specified in the log last command.
Fixes: https://tracker.ceph.com/issues/57340
Signed-off-by: Prashant D <pdhange@redhat.com>
(cherry picked from commit 32e40328fbdece9f6c573c11305ee525823e53c6)
Conflicts:
src/pybind/mgr/dashboard/frontend/package-lock.json
- Generate a new one
src/pybind/mgr/dashboard/frontend/src/app/ceph/rgw/rgw-config-modal/rgw-config-modal.component.html
- Accept the current changes
Ilya Dryomov [Thu, 16 Feb 2023 11:53:02 +0000 (12:53 +0100)]
qa/workunits/rbd-nbd: work around "rbd feature disable" hang
"rbd feature disable" appears to reliably hang if the corresponding
remote request is proxied to rbd-nbd (because rbd-nbd happens to own
the exclusive lock after a series of blkdiscard calls) [1]. Work
around it here by enabling journaling before the image is mapped
and disabling it after the image is unmapped.
Also, don't assert on the output of "rbd journal inspect --verbose"
having a certain number of entries. This is racy: if the script gets
delayed after the last blkdiscard call for some reason, there may be
fewer entries present in the journal or none at all.
Ilya Dryomov [Thu, 16 Feb 2023 11:51:04 +0000 (12:51 +0100)]
test/librbd: add LengthModifiedDiscardJournalAppendEnabled test
Currently nothing triggers the length_modified case in
ImageDiscardRequest::prune_object_extents() in isolation. It's only
triggered in DiscardGranularityJournalAppendEnabled test together with
the prune_required case and a bad refactoring could easily break the
length_modified logic again.
Josef Johansson [Mon, 2 Jan 2023 13:12:53 +0000 (14:12 +0100)]
librbd: Fix local rbd mirror journals growing forever
This commit fixes commit 7ca1bab90f3 by pushing properly aligned
discards back to m_image_extents, if corrected.
If discards are misaligned (off 0, len 4608, gran=4096), they are
corrected properly, but only in object_extents and not in
m_image_extents.
When journal_append_event is triggered it will only append from
m_image_extents and does not know about the alignment fixes. In
commit_io_events_extent it will log a message and return without
completing the io since the larger misaligned area was sent to the journal.
This will in turn break rbd journal mirroring since the local client will wait
indefinitely for the commit to be completed, which never happens.
This does not affect rbd-mirror in any way, which may be confusing and
dangerous since it's only rbd-mirror that updates ceph health, and not
the local client.
Setting `rbd_skip_partial_discard = false` under client will restore the
pre 7ca1bab behaviour and thus not trigger the bug with journals growing.
This will set `rbd_discard_granularity_bytes = 0` internally. This
setting is only changed during startup of a client.
Fixes: 7ca1bab90f3db3aaaa4cdbfc1f18e9f5cfbf5568
Fixes: https://tracker.ceph.com/issues/57396
Signed-off-by: Josef Johansson <josef@oderland.se>
(cherry picked from commit 21a26a752843295ff946d1543c2f5f9fac764593)
Conflicts:
src/librbd/io/ImageRequest.cc [ commit b2c88820923e ("librbd:
return area from extents_to_file()") not in quincy ]
src/test/librbd/io/test_mock_ImageRequest.cc [ commit b9a2384cdc43 ("librbd: propagate area down to
file_to_extents()") not in quincy ]
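
The discard alignment mentioned in the librbd commit above (off 0, len 4608, gran=4096) can be illustrated with a small sketch. It assumes the correction rounds the start up and the end down to the granularity; `align_discard` is a hypothetical helper, and the real librbd logic has additional conditions (for example around object size) that are not modeled here.

```python
from typing import Optional, Tuple


def align_discard(offset: int, length: int, granularity: int) -> Optional[Tuple[int, int]]:
    """Shrink a discard to granularity boundaries.

    Returns (offset, length) of the aligned extent, or None when nothing
    aligned remains. A granularity of 0 leaves the extent untouched.
    """
    if granularity <= 0:
        return (offset, length)
    start = -(-offset // granularity) * granularity          # round start up
    end = ((offset + length) // granularity) * granularity   # round end down
    if start >= end:
        return None
    return (start, end - start)
```

For the example from the commit message, `align_discard(0, 4608, 4096)` gives `(0, 4096)`. The point of the fix is that whatever corrected extent is issued to the objects must also be the extent recorded for the journal; otherwise the two disagree and the commit never completes.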
Adam King [Mon, 9 Jan 2023 19:50:12 +0000 (14:50 -0500)]
mgr/cephadm: fix extra container/entrypoint args with spaces
Fixes: https://tracker.ceph.com/issues/57338
Previously, specifying extra container args like
- "--cpus"
- "2"
would work fine as the two args would be passed separately and
eventually placed in the final podman/docker run command
with a space between them. However, trying to do something like
- "--cpus 2"
instead would fail, as it would be translated to
--extra-container-args=--cpus 2
causing "2" to be considered its own arg, which cephadm
wouldn't know how to handle. Another way this can cause problems
is listed in the linked tracker. Either way, leaving the spaces
in the args was causing problems, and the simplest way to handle
it seems to be to just split the original arg on the spaces
into multiple args.
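
The splitting approach described above can be sketched as follows. `split_extra_args` is a hypothetical helper written for illustration, not the cephadm function. Note that it uses `str.split()` with no argument, which splits on any run of whitespace, and that it makes no attempt to preserve quoted values that contain spaces.

```python
from typing import List


def split_extra_args(args: List[str]) -> List[str]:
    """Split every arg on whitespace so "--cpus 2" becomes two separate args."""
    result: List[str] = []
    for arg in args:
        # "--cpus 2" -> ["--cpus", "2"]; args without spaces pass through unchanged.
        result.extend(arg.split())
    return result
```

Both spellings from the commit message then produce the same argument list: `["--cpus", "2"]`.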
Zac Dover [Thu, 2 Mar 2023 18:04:30 +0000 (04:04 +1000)]
doc/radosgw: format admonitions
Break up the text of two similar admonitions into three paragraphs (in
each of the two instances). This makes the content of the admonition
much easier to read at a glance.
Adam King [Wed, 1 Mar 2023 21:10:41 +0000 (16:10 -0500)]
doc/cephadm: update cephadm compatibility and stability page
This page is very out of date. This commit probably doesn't
cover everything there is to say about stability and compatibility
in cephadm, but it at least gets it noticeably closer to reality.
Ilya Dryomov [Mon, 6 Feb 2023 16:56:00 +0000 (17:56 +0100)]
mon/MgrMap: dump last_failure_osd_epoch and active_clients at top level
Currently last_failure_osd_epoch and active_clients are dumped in the
always_on_modules dictionary in "ceph mgr dump" output. This goes back
to when these fields were added in commits f2986a4400bb ("mon/MgrMonitor:
blacklist previous instance") and df507cde8d71 ("mgr: forward RADOS
client instances for potential blacklist") but is wrong as these fields
have nothing to do with always-on modules.
Conflicts:
src/pybind/mgr/dashboard/frontend/package.json
- Accept the incoming changes
src/pybind/mgr/dashboard/frontend/package-lock.json
- Regenerate a new lock file
Conflicts:
- src/pybind/mgr/dashboard/frontend/cypress/integration/block/mirroring.e2e-spec.ts
Accept the current change (because the PR that introduced the change
is not in quincy)
- package-lock.json
Generate a new package-lock.json
bryanmontalvan [Mon, 27 Jun 2022 19:43:58 +0000 (15:43 -0400)]
mgr/dashboard: Simplified silence-form matchers list
This commit removes meaningless icons from the matchers-list component, which
now displays only the information/content needed when viewing and editing
matchers.