David Galloway [Wed, 31 Aug 2022 18:21:16 +0000 (14:21 -0400)]
.github: Give folks 30 seconds to fill out the checklist
Otherwise GitHub sends an annoying e-mail right away when you file a PR that doesn't have the checklist filled out. It's easier IMO to create the PR, then check the boxes instead of putting Xes in brackets while filling out the PR comment.
Signed-off-by: David Galloway <dgallowa@redhat.com>
RGW - Zipper - Remove a number of casts from rgw_admin
There are still a ton of casts to RadosStore in rgw_admin. Remove the
easy ones. Many of the rest represent actual operations that are
specific to RadosStore, and need to be split out.
Signed-off-by: Daniel Gryniewicz <dang@redhat.com>
J. Eric Ivancich [Tue, 23 Aug 2022 20:44:24 +0000 (16:44 -0400)]
rgw: remove dout_subsys defs from header files
Each compilation unit should be able to define its own dout_subsys
without generating a redefinition warning. When dout_subsys is defined
in header files, it complicates this matter. This commit removes
definitions from header files and makes sure definitions are added to
.cc files as needed.
Additionally, at Adam Emerson's suggestion, use "static constexpr"
rather than "#define" to set "dout_subsys" in a few places as a
reminder to ultimately do it more broadly.
Signed-off-by: J. Eric Ivancich <ivancich@redhat.com>
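A minimal sketch of the "static constexpr" form mentioned above, assuming
a stand-in enum for the subsystem ids. The names subsys_id, subsys_rgw and
current_log_subsys() are hypothetical and exist only for illustration;
they are not Ceph's real identifiers.

```cpp
#include <cassert>  // used by the check in the usage note

// Hypothetical stand-ins for the logging subsystem ids.
enum subsys_id : int { subsys_none = 0, subsys_rgw = 1, subsys_rgw_sync = 2 };

// Lives in a .cc file, never in a header. "static" gives the name
// internal linkage, so another .cc file can define its own dout_subsys
// with a different value and nothing is redefined.
static constexpr int dout_subsys = subsys_rgw;

// Hypothetical helper standing in for a logging macro that reads
// whichever dout_subsys is visible in the current translation unit.
constexpr int current_log_subsys() { return dout_subsys; }
```

Usage: assert(current_log_subsys() == subsys_rgw); holds in this file,
while a second .cc file is free to set dout_subsys to subsys_rgw_sync.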
Ilya Dryomov [Tue, 30 Aug 2022 09:45:44 +0000 (11:45 +0200)]
rbd-mirror: skip setting error code on snapshot replayer shutdown
This is regarding failures in unregister_remote_update_watcher() and
unregister_local_update_watcher(). handle_replay_complete() can't be
called in these cases anymore as it would blindly attempt to unregister
watchers from scratch again. Dropping handle_replay_complete() calls
there means that these failures would only be logged and would not be
surfaced by snapshot replayer. But the only caller ignores them
anyway:
  void ImageReplayer<I>::shut_down(int r) {
    ...
    // close the replayer
    if (m_replayer != nullptr) {
      ctx = new LambdaContext([this, ctx](int r) {
        m_replayer->destroy();
        m_replayer = nullptr;
        ctx->complete(0);  <------
      });
      ctx = new LambdaContext([this, ctx](int r) {
        m_replayer->shut_down(ctx);
      });
    }
Ilya Dryomov [Wed, 24 Aug 2022 10:56:31 +0000 (12:56 +0200)]
rbd-mirror: resume pending shutdown on error in snapshot replayer
If a shutdown is requested, e.g. by update_pool_replayers() because
remote RADOS instance got blocklisted, and Replayer::shut_down() pends
it on completion of current snapshot sync, it gets stuck if replayer
encounters an error in the interim. This is particularly likely in the
blocklist case: a higher layer may detect that client got blocklisted
and request a shutdown first, and then when replayer sees EBLOCKLISTED
in turn, it calls handle_replay_complete() -- which does not resume
a pending shutdown. Because update_pool_replayers() blocks on shutdown
with Mirror::m_lock held, eventually the entire daemon hangs in
perpetuity.
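The hang described above can be modeled with a small single-threaded
sketch. Everything here is hypothetical (the ReplayerSketch class, its
members, and the helper resume_pending_shut_down()); it only illustrates
the idea that the error path must also complete a shutdown that was
pended on the running sync, and it is not the rbd-mirror implementation.

```cpp
#include <cassert>  // used by the checks in the usage note
#include <functional>
#include <utility>

// Hypothetical model of a replayer that pends shutdown on a running sync.
class ReplayerSketch {
public:
  void start_sync() { m_sync_in_progress = true; }

  // A shutdown requested during a sync is pended until the sync ends.
  void shut_down(std::function<void(int)> on_finish) {
    if (m_sync_in_progress) {
      m_pending_shut_down = std::move(on_finish);
      return;
    }
    on_finish(0);
  }

  // Error path: record the error, then resume any pended shutdown.
  // Without the resume step the shutdown requester would wait forever.
  void handle_replay_complete(int r) {
    m_error_code = r;
    m_sync_in_progress = false;
    resume_pending_shut_down();
  }

  int error_code() const { return m_error_code; }

private:
  void resume_pending_shut_down() {
    if (m_pending_shut_down) {
      auto on_finish = std::move(m_pending_shut_down);
      m_pending_shut_down = nullptr;  // moved-from state is unspecified
      on_finish(0);
    }
  }

  bool m_sync_in_progress = false;
  int m_error_code = 0;
  std::function<void(int)> m_pending_shut_down;
};
```

Usage: call start_sync(), then shut_down() with a callback, then
handle_replay_complete(-1). The callback runs only in the last step.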
The addition of unselectable prompts to these three files
completes the work begun in PR#47810 (d8064b4), which sought
to bring dashboard.rst into line with the unselectable prompt
standard introduced by Kefu Chai in 2020.
Ronen Friedman [Thu, 18 Aug 2022 15:27:47 +0000 (18:27 +0300)]
common: improving fmtlib handling of ceph::utime_t
1. fixing the output to show local-time instead of UTC format, matching
operator<<() handling (and all the rest of our logs)
2. adding a 'short' mode (as {:s}) for when, e.g. in most scrub logs,
we only need 3 digits for the sub-second, and do not need the
trailing TZ designation.
so we can use the formatter defined for `LogEntry` in fmtlib v9.
in this new version of fmtlib, it is required to define a specialization
for the formatted type even when it comes to the types with an override of
operator<<(). since we already have an override for `LogEntry`, let's define
the specialization for `fmt::formatter<LogEntry>`.
this change should address the FTBFS when building with fmtlib v9.
Kefu Chai [Sat, 27 Aug 2022 02:27:01 +0000 (10:27 +0800)]
common/Journald: include msg/msg_fmt.h
so we can use the formatter defined for `entity_name_t`. in fmtlib v9,
it is required to define a specialization for the formatted type even
if the type has an override of operator<<(). now that we already have a
formatter for `entity_name_t`, let's just use it.
this change should address the FTBFS when building with fmtlib v9.
Ilya Dryomov [Sat, 27 Aug 2022 09:09:00 +0000 (11:09 +0200)]
librbd: use actual monitor addresses when creating a peer bootstrap token
Relying on mon_host config option is fragile, as the user may confuse
v1 and v2 addresses, group them incorrectly, etc. Get mon_host value
only as a fallback.
Kefu Chai [Sat, 27 Aug 2022 15:46:00 +0000 (23:46 +0800)]
mon/MgrMonitor: do not propose again for "mgr fail"
in 23c3f76018b446fb77bbd71fdd33bddfbae9e06d, the change to fail the mgr
is proposed immediately. but `MgrMonitor::prepare_command()` method still
returns `true` in this case. its indirect caller of
`PaxosService::dispatch()` considers this as a sign that it needs to
propose the change with `propose_pending()`. but the pending change has
already been proposed by `MgrMonitor::prepare_command()`, and
`have_pending` is also cleared by this call. as we don't allow
consecutive paxos proposals, the second `propose_pending()` call is
delayed with a configured latency. but when the timer is fired, this
postponed call would find itself trying to propose nothing. the change
to fail the mgr has been proposed. that's why we have
`ceph_assert(have_pending)` assertion failures.
in this change, the second proposal is skipped if the change has
already been proposed immediately. this should avoid the assertion
failure.
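The control flow can be sketched as follows, assuming a simplified,
hypothetical ServiceSketch type. The member names echo the ones in the
message, but the bodies are illustrative only and timers are left out;
the point is that prepare_command() reports "nothing left to propose"
once it has proposed on its own.

```cpp
#include <cassert>

// Hypothetical model of a Paxos-backed service; not the real classes.
struct ServiceSketch {
  bool have_pending = false;
  int proposals = 0;

  void propose_pending() {
    assert(have_pending);  // stands in for ceph_assert(have_pending)
    have_pending = false;
    ++proposals;
  }

  // Returns true only when the caller still has to propose.
  bool prepare_command(bool propose_immediately) {
    have_pending = true;
    if (propose_immediately) {
      propose_pending();
      return false;  // already proposed, so the caller must not repeat it
    }
    return true;
  }

  void dispatch(bool propose_immediately) {
    if (prepare_command(propose_immediately)) {
      propose_pending();
    }
  }
};
```

Usage: after dispatch(true) exactly one proposal has been made and
have_pending is false, so no later call can trip the assertion.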
Kefu Chai [Sat, 27 Aug 2022 01:51:02 +0000 (09:51 +0800)]
cmake: set CMP0135 policy
so the `DOWNLOAD_EXTRACT_TIMESTAMP` property of
`ExternalProject_Add()` command is set by default on CMake v3.24 and up.
it helps to set a more accurate timestamp for the downloaded
content, hence the targets depending on the extracted content can be
rebuilt if the URL changes.
see also https://cmake.org/cmake/help/latest/policy/CMP0135.html
Adam King [Thu, 25 Aug 2022 16:09:49 +0000 (12:09 -0400)]
mgr/orchestrator/tests: don't match exact whitespace in table output
It seems that the exact spacing may differ a bit between
python versions. Currently seeing py3 (which corresponds to py 3.6
on my system) passing these tests and py37 (which is python 3.7
obviously) failing. I think verifying against the exact whitespace
is unnecessary anyhow. As long as it isn't egregious, we don't
really need to worry about exactly what the spacing is.
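The idea of comparing table output without pinning exact spacing can be
sketched like this. The tests in question are Python; this is only a C++
illustration, and the helpers collapse_whitespace() and
same_ignoring_whitespace() are hypothetical names, not part of the
orchestrator test suite.

```cpp
#include <cassert>  // used by the checks in the usage note
#include <sstream>
#include <string>

// Collapse every run of whitespace (spaces, tabs, newlines) into a
// single space and drop leading and trailing whitespace.
std::string collapse_whitespace(const std::string& text) {
  std::istringstream in(text);
  std::string word;
  std::string out;
  while (in >> word) {
    if (!out.empty()) {
      out += ' ';
    }
    out += word;
  }
  return out;
}

// Two outputs match if they contain the same tokens in the same order.
bool same_ignoring_whitespace(const std::string& a, const std::string& b) {
  return collapse_whitespace(a) == collapse_whitespace(b);
}
```

Usage: same_ignoring_whitespace("NAME   PORTS", "NAME PORTS") is true,
while same_ignoring_whitespace("a b", "ab") is false because the tokens
differ.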
Zac Dover [Thu, 25 Aug 2022 15:56:41 +0000 (01:56 +1000)]
doc/mgr: add prompt directives to dashboard.rst
This commit adds prompt directives (.. prompt:: bash $) to
the commands in dashboard.rst.
There are several ".. include::" directives in the dashboard.rst
file, which means that part of this page is sourced from elsewhere
than the dashboard.rst file. Because I have not yet added prompt
directives to those files, there is an inconsistency in the rendering
of this file. Most of the commands on this page have unselectable
prompts (unselectable prompts are the prompts that don't get added to
the buffer when you copy them to one of the clipboards). But the
commands on this page that come from those ".. include::" directives
do not yet have unselectable prompts.
This file is over 1600 lines long. It was perhaps not optimally wise
of me to have edited all of it in one fell swoop. It took many hours,
and carefully checking it will probably take at least one hour. I
suggest that whoever reviews this should not spend much time on it,
but should instead make a quick pass over the page and make sure that
it looks passable.
The English syntax on this page (and throughout the Dashboard
documentation) will be tightened to remove ambiguity and to improve
readability in the near future, so hold all English-language-related
comments for a future pull request.
John Mulligan [Thu, 25 Aug 2022 13:58:55 +0000 (09:58 -0400)]
pybind/mgr: tox.ini remove redundant `tox` env
Fixes: https://tracker.ceph.com/issues/57153
The envlist contained an environment named `lint`. There was no specific
customization of the lint testenv so it is essentially the same as
running the `py3` testenv.
This was probably a typo and was meant to be `pylint`. Unfortunately,
the pylint test env does not appear to work, probably because it was
never run as part of any automation. At the risk of leaving old stuff
behind I'm not removing the pylint testenv at the moment, only the
`lint` item in order to not run redundant tests.
Signed-off-by: John Mulligan <jmulligan@redhat.com>
Laura Flores [Wed, 24 Aug 2022 22:23:45 +0000 (22:23 +0000)]
src/pybind/mgr/telemetry: parse `outb` instead of `outs`
Following the merge of https://github.com/ceph/ceph/pull/47650, which
fixes the confusion between std out and std err in admin socket
commands, we will need to reference the out stream (outb) instead
of the error stream (outs) when we parse heap stats.
Adam King [Wed, 24 Aug 2022 14:36:53 +0000 (10:36 -0400)]
doc/cephadm: fix example for specifying networks for rgw
count_per_host must be written with underscores rather
than dashes to work; you need to pass service_id, not
service_name; and the option for the port is called
rgw_frontend_port, not just "port".