Yuval Lifshitz [Wed, 23 Feb 2022 15:21:10 +0000 (17:21 +0200)]
rgw: prevent spurious/lost notifications in the index completion thread
this was happening when asyn completions happened during reshard.
more information about testing:
https://gist.github.com/yuvalif/d526c0a3a4c5b245b9e951a6c5a10517
we also add more logs to the completion manager.
should allow finding unhandled completions due to reshards.
Zac Dover [Thu, 24 Feb 2022 07:22:42 +0000 (17:22 +1000)]
doc/start: include A. D'Atri's hardware-recs recs
This PR restores material about partition alignment
and material about separating OS and OSD data that
was removed in an earlier rewrite. The restoration
of this information was requested by Anthony D'Atri in
https://github.com/ceph/ceph/pull/45123/
This PR also includes several refinements to the language
that could not be made to this text until now, owing to my
(Zac's) ignorance and illiteracy.
I call upon Mark Nelson (and anyone else with sufficient
command of the current state of storage technology) to advise
me on whether the Ceph Foundation feels comfortable in the year
2022 referring to QLC as an emerging technology.
Signed-off-by: Zac Dover <zac.dover@gmail.com>
(squash) more notes and revisions
John Mulligan [Wed, 2 Feb 2022 20:58:08 +0000 (15:58 -0500)]
doc/mgr/nfs: document that nfs exports related mgr call requirements
A recent change in the mgr/nfs module should enable the functioning
of export management commands/API calls as long as the rados namespaces
and objects have been already established. Document this fact, noting
that now only the `ceph nfs cluster ...` calls *require* an
orchestration module.
Signed-off-by: John Mulligan <jmulligan@redhat.com>
John Mulligan [Sat, 29 Jan 2022 16:23:00 +0000 (11:23 -0500)]
mgr/nfs: support managing exports without orchestration enabled
This change allows the `ceph nfs export ...` commands to function
without the entire mgr/nfs subsystem requiring orchestration to be
enabled. When there's no orchestration available, the code falls back
to examining the namespaces in the ".nfs" rados pool to determine what
cluster_id values are valid.
This change does not add support for creating the rados objects and
namespace needed to manage a nfs cluster. As discussed with the
orchestration group on 2022-01-22, rook does not need the mgr module to
establish the namespace. So, for now, we'll defer the work needed to
create the namespace/objects when orchestration is disabled.
Fixes: https://tracker.ceph.com/issues/54043 Signed-off-by: John Mulligan <jmulligan@redhat.com>
John Mulligan [Thu, 27 Jan 2022 22:06:02 +0000 (17:06 -0500)]
mgr/nfs: clean up rados object naming code
The naming of rados objects used to store the nfs config was spread
all over the code, including inline f-strings, not-static methods,
etc.
This change unifies the naming by putting constant string prefixes
and name generating functions into the utils.py file.
Signed-off-by: John Mulligan <jmulligan@redhat.com>
John Mulligan [Thu, 27 Jan 2022 21:06:50 +0000 (16:06 -0500)]
mgr/nfs: make _check_rados_notify a function
This was previously a staticmethod. This static method was only used by
NFSRados object. Staticmethods are nearly always better implemented as
functions, which is done so here.
Signed-off-by: John Mulligan <jmulligan@redhat.com>
John Mulligan [Thu, 27 Jan 2022 20:50:13 +0000 (15:50 -0500)]
mgr/nfs: limit dependency of NFSRados object
Previously, the NFSRados object accepted the "Module" as the
first argument but only used the rados attribute (type rados.Rados).
It's better to limit the scope of types when reasonably possible
so we can see what the true dependencies are. So we restrict
NFSRados to accepting a rados.Rados as the argument.
Signed-off-by: John Mulligan <jmulligan@redhat.com>
Yaarit Hatuka [Tue, 22 Feb 2022 19:22:09 +0000 (19:22 +0000)]
mgr/devicehealth: skip null pages when extracting wear level
Some devices have null pages in their ata_device_statistics struct; skip
those pages in order to avoid an AttributeError when extracting device's
wear level.
osd: Write non-zero data as part of osd benchmark test.
An optimization (see PR: https://github.com/ceph/ceph/pull/43337) was made
in BlueStore to avoid writing bufferlists made up of zeros. The osd
benchmark used zero filled bufferlists and this resulted in inflated osd
benchmark results.
This issue is fixed by using bufferlists filled with non-zero values.
Ilya Dryomov [Sun, 20 Feb 2022 16:33:08 +0000 (17:33 +0100)]
rbd-mirror: synchronize with in-flight stop in ImageReplayer::stop()
Complete on_finish right away only if the replayer is stopped (meaning
that it is legible to be restarted immediately, possibly from on_finish
itself). This is the behaviour pretty much anyone would assume and
also what ImageReplayer::restart() relies on.
Ilya Dryomov [Sun, 20 Feb 2022 12:11:02 +0000 (13:11 +0100)]
rbd-mirror: manual stop should take precedence over regular stop
Somewhat similar to commit 0a3794e56256 ("rbd-mirror: make stop
properly cancel restart"), make it so that a) if a manual stop is
joined to regular stop, the stop becomes manual and b) if a regular
stop is joined to a manual stop, the stop stays manual.
Ilya Dryomov [Sat, 19 Feb 2022 15:43:04 +0000 (16:43 +0100)]
rbd-mirror: straighten ImageReplayer::stop() a bit
- don't default on_finish parameter
- m_restart_requested is set in ImageReplayer::restart() which is the
only restart=true call site, so setting m_restart_requested here is
redundant
- is_stopped_() can't be true in is_running_() branch
- on_finish->complete(0) in the end is unreachable
myoungwon oh [Fri, 13 Aug 2021 08:15:54 +0000 (17:15 +0900)]
seastore: seperate Journal interface from SegmentedJournal implementation
A subsequent PR will introduce a CircularBoundedJournal implementation
for fast nvme devices.
SegmentCleaner no longer needs a reference to Journal, so dispense with
the set_segment_provider machinery and simply pass it in the
constructor.
Move responsibility for finding the journal segments into the journal
itself. This does mean that we check the segment headers on the journal
device twice, but that should be a neglible amount of overhead on mount.
SegmentCleaner::init_segments no longer needs to return Journal
segments, so merge with mount().
Signed-off-by: Myoungwon Oh <myoungwon.oh@samsung.com> Signed-off-by: Samuel Just <sjust@redhat.com>
Mykola Golub [Fri, 18 Feb 2022 10:42:23 +0000 (10:42 +0000)]
rbd-mirror: make mirror properly detect pool replayer needs restart
When a PoolReplayer detects remote pool metadata change it
sets "stopping" flag expecting the Mirror will restart it.
Although setting "stopping" flag makes the PoolReplayer::run
thread to terminate, the thread's is_started function will still
return true until join is called (and reset the thread id).
This made impossible for the Mirror to detect (by calling
PoolReplayer::is_running) that the PoolReplayer needed restart.
Kefu Chai [Fri, 18 Feb 2022 15:23:54 +0000 (23:23 +0800)]
crimson: specialize fmt::formatter<>() for crimson types
otherwise we'd have FTBFS like
/usr/include/fmt/core.h:1727:3: error: static_assert failed due to requirement 'formattable' "Cannot format an argument. To make type T formattable provide a formatter<T> specialization: https://fmt.dev/latest/api$
static_assert(
^
/usr/include/fmt/core.h:1853:23: note: in instantiation of function template specialization 'fmt::detail::make_arg<true, fmt::basic_format_context<fmt::appender, char>, fmt::detail::type::custom_type, crimson::os:$
data_{detail::make_arg<
please note, delta_op_t is lifted out of the templated outer class
to avoid the headache of specialization of template of template in
another namespace.