John Mulligan [Wed, 2 Feb 2022 20:58:08 +0000 (15:58 -0500)]
doc/mgr/nfs: document that nfs exports related mgr call requirements
A recent change in the mgr/nfs module should enable the functioning
of export management commands/API calls as long as the rados namespaces
and objects have been already established. Document this fact, noting
that now only the `ceph nfs cluster ...` calls *require* an
orchestration module.
Signed-off-by: John Mulligan <jmulligan@redhat.com>
John Mulligan [Sat, 29 Jan 2022 16:23:00 +0000 (11:23 -0500)]
mgr/nfs: support managing exports without orchestration enabled
This change allows the `ceph nfs export ...` commands to function
without the entire mgr/nfs subsystem requiring orchestration to be
enabled. When there's no orchestration available, the code falls back
to examining the namespaces in the ".nfs" rados pool to determine what
cluster_id values are valid.
This change does not add support for creating the rados objects and
namespace needed to manage a nfs cluster. As discussed with the
orchestration group on 2022-01-22, rook does not need the mgr module to
establish the namespace. So, for now, we'll defer the work needed to
create the namespace/objects when orchestration is disabled.
Fixes: https://tracker.ceph.com/issues/54043 Signed-off-by: John Mulligan <jmulligan@redhat.com>
John Mulligan [Thu, 27 Jan 2022 22:06:02 +0000 (17:06 -0500)]
mgr/nfs: clean up rados object naming code
The naming of rados objects used to store the nfs config was spread
all over the code, including inline f-strings, not-static methods,
etc.
This change unifies the naming by putting constant string prefixes
and name generating functions into the utils.py file.
Signed-off-by: John Mulligan <jmulligan@redhat.com>
John Mulligan [Thu, 27 Jan 2022 21:06:50 +0000 (16:06 -0500)]
mgr/nfs: make _check_rados_notify a function
This was previously a staticmethod. This static method was only used by
NFSRados object. Staticmethods are nearly always better implemented as
functions, which is done so here.
Signed-off-by: John Mulligan <jmulligan@redhat.com>
John Mulligan [Thu, 27 Jan 2022 20:50:13 +0000 (15:50 -0500)]
mgr/nfs: limit dependency of NFSRados object
Previously, the NFSRados object accepted the "Module" as the
first argument but only used the rados attribute (type rados.Rados).
It's better to limit the scope of types when reasonably possible
so we can see what the true dependencies are. So we restrict
NFSRados to accepting a rados.Rados as the argument.
Signed-off-by: John Mulligan <jmulligan@redhat.com>
Yaarit Hatuka [Tue, 22 Feb 2022 19:22:09 +0000 (19:22 +0000)]
mgr/devicehealth: skip null pages when extracting wear level
Some devices have null pages in their ata_device_statistics struct; skip
those pages in order to avoid an AttributeError when extracting device's
wear level.
osd: Write non-zero data as part of osd benchmark test.
An optimization (see PR: https://github.com/ceph/ceph/pull/43337) was made
in BlueStore to avoid writing bufferlists made up of zeros. The osd
benchmark used zero filled bufferlists and this resulted in inflated osd
benchmark results.
This issue is fixed by using bufferlists filled with non-zero values.
Ilya Dryomov [Sun, 20 Feb 2022 16:33:08 +0000 (17:33 +0100)]
rbd-mirror: synchronize with in-flight stop in ImageReplayer::stop()
Complete on_finish right away only if the replayer is stopped (meaning
that it is legible to be restarted immediately, possibly from on_finish
itself). This is the behaviour pretty much anyone would assume and
also what ImageReplayer::restart() relies on.
Ilya Dryomov [Sun, 20 Feb 2022 12:11:02 +0000 (13:11 +0100)]
rbd-mirror: manual stop should take precedence over regular stop
Somewhat similar to commit 0a3794e56256 ("rbd-mirror: make stop
properly cancel restart"), make it so that a) if a manual stop is
joined to regular stop, the stop becomes manual and b) if a regular
stop is joined to a manual stop, the stop stays manual.
Ilya Dryomov [Sat, 19 Feb 2022 15:43:04 +0000 (16:43 +0100)]
rbd-mirror: straighten ImageReplayer::stop() a bit
- don't default on_finish parameter
- m_restart_requested is set in ImageReplayer::restart() which is the
only restart=true call site, so setting m_restart_requested here is
redundant
- is_stopped_() can't be true in is_running_() branch
- on_finish->complete(0) in the end is unreachable
myoungwon oh [Fri, 13 Aug 2021 08:15:54 +0000 (17:15 +0900)]
seastore: seperate Journal interface from SegmentedJournal implementation
A subsequent PR will introduce a CircularBoundedJournal implementation
for fast nvme devices.
SegmentCleaner no longer needs a reference to Journal, so dispense with
the set_segment_provider machinery and simply pass it in the
constructor.
Move responsibility for finding the journal segments into the journal
itself. This does mean that we check the segment headers on the journal
device twice, but that should be a neglible amount of overhead on mount.
SegmentCleaner::init_segments no longer needs to return Journal
segments, so merge with mount().
Signed-off-by: Myoungwon Oh <myoungwon.oh@samsung.com> Signed-off-by: Samuel Just <sjust@redhat.com>
Mykola Golub [Fri, 18 Feb 2022 10:42:23 +0000 (10:42 +0000)]
rbd-mirror: make mirror properly detect pool replayer needs restart
When a PoolReplayer detects remote pool metadata change it
sets "stopping" flag expecting the Mirror will restart it.
Although setting "stopping" flag makes the PoolReplayer::run
thread to terminate, the thread's is_started function will still
return true until join is called (and reset the thread id).
This made impossible for the Mirror to detect (by calling
PoolReplayer::is_running) that the PoolReplayer needed restart.
Kefu Chai [Fri, 18 Feb 2022 15:23:54 +0000 (23:23 +0800)]
crimson: specialize fmt::formatter<>() for crimson types
otherwise we'd have FTBFS like
/usr/include/fmt/core.h:1727:3: error: static_assert failed due to requirement 'formattable' "Cannot format an argument. To make type T formattable provide a formatter<T> specialization: https://fmt.dev/latest/api$
static_assert(
^
/usr/include/fmt/core.h:1853:23: note: in instantiation of function template specialization 'fmt::detail::make_arg<true, fmt::basic_format_context<fmt::appender, char>, fmt::detail::type::custom_type, crimson::os:$
data_{detail::make_arg<
please note, delta_op_t is lifted out of the templated outer class
to avoid the headache of specialization of template of template in
another namespace.
Casey Bodley [Tue, 15 Feb 2022 23:12:00 +0000 (18:12 -0500)]
rbd: avoid get_callback_adapter() for tcp_stream::async_connect()
works around a compilation failure in c++20 (with gcc 11.2 and boost
1.76) when choosing between two overloads of
`boost::beast::tcp_stream::async_connect()`
`get_callback_adapter()` returns a variadic lambda that matches the
concept for both overloads (`completion_token_for<ConnectHandler>`
and `completion_token_for<RangeConnectHandler>`), but compilation of
the wrapped lambda fails for the `ConnectHandler` overload because it
expects two arguments instead of one
instead of using `get_callback_adapter()` to convert the first argument
from `boost::system::error_code` to `int` for the wrapped lambda, do
this in the lambda itself
Laura Flores [Fri, 11 Feb 2022 19:37:26 +0000 (19:37 +0000)]
mgr/telemetry: handle empty device report when "send" is triggered
On certain environments, such as the "ceph-dev-docker" environment
(https://github.com/ricardoasmarques/ceph-dev-docker), the mgr
module is unable to fetch device metrics. As a result, the device
report generated by "gather_device_report()" returns an empty dict.
This causes an AssertionError when the "send" function is triggered
(i.e. by running `ceph telemetry status` or `ceph telemetry send`),
and the module crashes.
The fix in this commit checks that the generated device report
contains metrics before trying to send it. If the device report
does not contain metrics (it returns an empty dict), the module
will log an appropriate message in the mgr log and not send the
device report.
If this scenario happens when running the `ceph telemetry send` command,
the user will additionally see this message:
```
Ceph report sent to https://telemetry.ceph.com/report
Unable to send device report: channel is on, but generated report was empty.
```
I also added a few more debug messages in gather_device_report() to make
future debugging easier.
Fixes: https://tracker.ceph.com/issues/54250 Signed-off-by: Laura Flores <lflores@redhat.com>