Laura Flores [Tue, 17 Dec 2024 23:18:11 +0000 (17:18 -0600)]
PendingReleaseNotes: add note about tracker #69012
We merged a fix for v19.2.1 that helps alleviate
the worst of this problem (https://tracker.ceph.com/issues/68657),
but it still comes up on occasion. This release note addresses the
remaining issues tracked in https://tracker.ceph.com/issues/69012.
Oshrey Avraham [Wed, 18 Dec 2024 14:23:40 +0000 (16:23 +0200)]
rgw/notifications: Add tests for RGWPSListTopicsOp::execute()
Tests:
Add comprehensive test cases to verify the behavior of `RGWPSListTopicsOp::execute()` under various scenarios:
Migration case: Validate correct handling when `support_all_zones` is enabled, with v1 in a new state after migration and v2 topics present.
v2 notification case: Ensure proper retrieval when v2 notifications are supported.
v1 notification case: Verify fallback behavior when v2 notifications are unavailable.
Enhancements:
Update `delete_all_topics` to handle v1 responses with the `result` key.
Zac Dover [Wed, 18 Dec 2024 09:25:00 +0000 (19:25 +1000)]
doc/radosgw: edit uadk-accel.rst
Incorporate Anthony D'Atri's suggested changes from
https://github.com/ceph/ceph/pull/60953 into doc/radosgw/uadk-accel.rst.
Two questions from that PR remain unclear to me: one is about whether
IOMMU should be disabled for performance on AMD EPYC systems, and the
other is about UADK. The note about UADK will be rewritten in improved
English in a near-future PR and any remaining technical questions that
involve it can be discussed in that PR.
A bogus change introduced as part of PR#54363 (commit fbb7d73) changed multiple 'scrub' commands to 'scheduled-scrub'.
In this one instance, that change was wrong.
Aashish Sharma [Fri, 6 Dec 2024 09:57:25 +0000 (15:27 +0530)]
mgr/dashboard: Fix Latency chart data units in rgw overview page
Issue: The Latency chart on the rgw overview page shows the wrong data
unit. The unit passed as an input to the dashboard-area chart component
is `ms`, whereas the data returned by the metrics is in seconds. As a
result, a value such as 0.725s is displayed in the chart as 0.725ms
when it should be 725ms.
Fix: Pass the value in ms, as the dashboard area chart component
expects.
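The arithmetic behind the fix can be sketched as follows. This is an illustration only: the dashboard itself is not written in C++, and `seconds_to_ms` is a hypothetical helper name, not a function from the Ceph tree.

```cpp
#include <cmath>

// Hypothetical helper: convert a latency sample reported in seconds into
// milliseconds, so that a chart labelling its axis in ms shows 725 for a
// 0.725 s sample instead of 0.725.
double seconds_to_ms(double seconds) {
  return seconds * 1000.0;
}
```

The same sample reads as 0.725 on an `ms` axis if the conversion is skipped, which is the symptom described above.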
Casey Bodley [Mon, 16 Dec 2024 22:10:04 +0000 (17:10 -0500)]
rgw/posix: std::ignore return value of write()
/ceph/src/rgw/driver/posix/notify.h: In member function 'void file::listing::Inotify::signal_shutdown()':
/ceph/src/rgw/driver/posix/notify.h:215:19: error: ignoring return value of 'ssize_t write(int, const void*, size_t)' declared with attribute 'warn_unused_result' [-Werror=unused-result]
215 | (void) write(efd, &msg, sizeof(uint64_t));
| ~~~~~^~~~~~~~~~~~~~~~~~~~~~~~~~~~~
cc1plus: all warnings being treated as errors
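A minimal sketch of the idiom named in the subject line, assuming a POSIX system. As the compiler output above shows, GCC does not accept a `(void)` cast as a use of a `warn_unused_result` value, whereas assigning to `std::ignore` does count as a use. The helper names `signal_fd` and `roundtrip_ok` are hypothetical, and a pipe stands in for the descriptor used in notify.h.

```cpp
#include <cstdint>
#include <tuple>     // std::ignore
#include <unistd.h>  // pipe, read, write, close

// Write one 8-byte wakeup token to fd and deliberately discard the result.
// Assigning the return value to std::ignore uses it, so -Wunused-result
// is not triggered.
void signal_fd(int fd) {
  std::uint64_t msg = 1;
  std::ignore = write(fd, &msg, sizeof(msg));
}

// Hypothetical self-check: send the token through a pipe and read it back.
bool roundtrip_ok() {
  int fds[2];
  if (pipe(fds) != 0) {
    return false;
  }
  signal_fd(fds[1]);
  std::uint64_t got = 0;
  ssize_t n = read(fds[0], &got, sizeof(got));
  close(fds[0]);
  close(fds[1]);
  return n == static_cast<ssize_t>(sizeof(got)) && got == 1;
}
```

Discarding the result is only reasonable where a failed write is harmless, as in a best-effort shutdown signal.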
cephadm/nvmeof: support per-node gateway addresses
Added gateway and discovery address maps to the service specification.
These maps store per-node service addresses. The address is first searched
in the map, then in the spec address configuration. If neither is defined,
the host IP is used as a fallback.
Signed-off-by: Alexander Indenbaum <aindenba@redhat.com>
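The lookup order described above (per-node map, then the spec address, then the host IP) can be sketched as follows. This is not the cephadm implementation, which is not written in C++; `GatewaySpec`, `addr_map`, `addr` and `resolve_addr` are hypothetical names chosen for illustration.

```cpp
#include <map>
#include <optional>
#include <string>

// Hypothetical, simplified view of the relevant spec fields.
struct GatewaySpec {
  std::map<std::string, std::string> addr_map;  // hostname -> address
  std::optional<std::string> addr;              // spec-wide address, if set
};

// Resolve the service address for one node:
// 1. the per-node map entry, 2. the spec-wide address, 3. the host IP.
std::string resolve_addr(const GatewaySpec& spec,
                         const std::string& hostname,
                         const std::string& host_ip) {
  if (auto it = spec.addr_map.find(hostname); it != spec.addr_map.end()) {
    return it->second;
  }
  if (spec.addr) {
    return *spec.addr;
  }
  return host_ip;
}
```

The same chain would apply independently to the gateway map and the discovery map.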
Naman Munet [Fri, 22 Nov 2024 09:57:44 +0000 (15:27 +0530)]
mgr/dashboard: Administration > Configuration > Some of the config options are not updatable at runtime
Fixes: https://tracker.ceph.com/issues/68976
Fixes include:
1) Bypass the 'can_update_at_runtime' flag for 'rgw'-related configuration options, since these can be updated at runtime via the CLI.
Also add a warning popup that lets the user force-edit rgw-related configuration options.
2) Administration >> Configuration now shows the modified configuration, as in the CLI's "ceph config dump",
instead of the configuration filtered by level:basic.
Anoop C S [Wed, 11 Sep 2024 10:38:12 +0000 (16:08 +0530)]
cephfs: Add a pkgconfig file for libcephfs
It would really help external consumers to collectively figure out all
required parameters to link, build or load with libcephfs from a single
source of truth.
Samuel Just [Fri, 13 Dec 2024 01:37:28 +0000 (17:37 -0800)]
doc/dev/crimson/pipeline.rst: simplify and update to reflect new stages
This commit updates pipeline.rst to include some basic information about
how the pipeline stages now work. I've removed the explicit listing of
the different stages as I'd rather readers refer to the actual
implementation for those details to avoid them getting out of date.
I also removed the comparison to classic as the approach has now diverged
quite a bit and I feel that the ordering part is more important to focus
on than the points at which processing might block.
Samuel Just [Wed, 27 Nov 2024 02:22:16 +0000 (18:22 -0800)]
crimson: introduce and use repop stage
Repops previously used PGPipeline::await_map. This is actually
important as we need them to be processed in order. However, using
await_map was confusing and using a single exclusive stage is decidedly
suboptimal as we could allow pipelining on write commit. For now, move
them over to their own pipeline stage so we can remove the PGPipeline
struct entirely. Later, we'll improve replica write pipelining for
better replica-side write concurrency.
We want to emplace and initialize osd_op_params upon first write,
but we don't want to fill at_version, pg_trim_to, pg_committed_to,
or last_complete until prepare_transaction because we don't want to
require a particular commit order any earlier than we have to.
That the log entry's version matches the object_info on the actual
object is a pretty core invariant. This commit moves creating the
log entry for head and populating the metadata into
OpsExecuter::prepare_head_update.
As a side effect, flush_clone_metadata and CloningCtx::apply_to
were removed and split between prepare_head_update (portions
related to the head's ssc) and flush_changes_and_submit.
Samuel Just [Thu, 24 Oct 2024 23:33:56 +0000 (16:33 -0700)]
crimson: expose CommonOBCPipeline via ObjectContextLoader::Orderer
- adds ObjectContext::obc_pipeline
- exposes ObjectContext::obc_pipeline via ObjectContextLoader::Orderer
- allows obcs to be in the registry without being loaded
- adds ObjectContext::loading bool to signal that loading has begun
The HeadBucket API now reports the `X-RGW-Bytes-Used` and
`X-RGW-Object-Count` headers only when the `read-stats` querystring
is explicitly included in the API request.
Zac Dover [Fri, 13 Dec 2024 06:12:49 +0000 (16:12 +1000)]
doc/cephfs: edit 3rd 3rd of mount-using-kernel-driver
Edit the third third of doc/cephfs/mount-using-kernel-driver.rst in
preparation for correcting mount commands that may not work in Reef as
described in this documentation.
This commit edits only English-language strings in
doc/cephfs/mount-using-kernel-driver.rst. No technical content (that is,
no commands and no settings) has been altered in this commit.
Technical alterations to this file will be made only after the English
is unambiguous.
This PR follows the following two PRs:
https://github.com/ceph/ceph/pull/61048 - 1st 3rd
https://github.com/ceph/ceph/pull/61049 - 2nd 3rd
Co-authored-by: Anthony D'Atri <anthony.datri@gmail.com>
Signed-off-by: Zac Dover <zac.dover@proton.me>
Ilya Dryomov [Thu, 12 Dec 2024 20:32:39 +0000 (21:32 +0100)]
librbd/migration/HttpClient: socket isn't shut down on some state transitions
If shut_down() gets delayed until a) the state transition from
STATE_RESET_CONNECTING completes and the reconnect is unsuccessful or
b) the state transition from STATE_RESET_DISCONNECTING completes (i.e.
next_state is STATE_UNINITIALIZED or STATE_RESET_CONNECTING), the
socket needs to be shut down before m_on_shutdown is invoked. The line
of thought here is the same as for the corresponding state transitions
that don't involve STATE_SHUTTING_DOWN.
Ilya Dryomov [Wed, 11 Dec 2024 15:25:13 +0000 (16:25 +0100)]
librbd/migration/HttpClient: avoid hitting an assert in advance_state()
If the shutdown gets delayed until the state transition from
STATE_RESET_CONNECTING completes and the reconnect is successful
(i.e. next_state is STATE_READY), we eventually hit "unexpected
state transition" assert in advance_state(). The reason is that
advance_state() would update m_state and call disconnect() under
STATE_READY instead of STATE_SHUTTING_DOWN. After the disconnect
maybe_finalize_shutdown() would enter advance_state() again with
STATE_SHUTDOWN as next_state, but the transition to that from
STATE_READY is invalid.
Plug this by not transitioning to next_state if current_state is
STATE_SHUTTING_DOWN.
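The guard described above can be sketched with a reduced state machine. This is a simplified model, not the HttpClient code: the `Session` type, the four-value `State` enum and the single-argument `advance_state()` are assumptions made for illustration.

```cpp
// Hypothetical, reduced set of states.
enum class State { RESET_CONNECTING, READY, SHUTTING_DOWN, SHUTDOWN };

struct Session {
  State state = State::RESET_CONNECTING;

  // Called when an in-flight transition completes; next_state is the state
  // that transition was heading for.
  void advance_state(State next_state) {
    if (state == State::SHUTTING_DOWN && next_state != State::SHUTDOWN) {
      // A shutdown was requested while the transition was in flight.
      // Stay in SHUTTING_DOWN rather than adopting next_state, so the
      // later move to SHUTDOWN starts from a valid state.
      return;
    }
    state = next_state;
  }
};
```

Without the guard, the session in this model would enter READY and the subsequent move to SHUTDOWN would start from READY, which is the invalid transition the commit message describes.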
Ilya Dryomov [Mon, 9 Dec 2024 10:19:57 +0000 (11:19 +0100)]
librbd/migration/HttpClient: ignore stream_truncated when shutting down SSL
Propagate ec to handle_disconnect() and use it to suppress
stream_truncated errors. Here is a quote from Beast documentation [1]:
// Gracefully shutdown the SSL/TLS connection
error_code ec;
stream.shutdown(ec);
// Non-compliant servers don't participate in the SSL/TLS shutdown process and
// close the underlying transport layer. This causes the shutdown operation to
// complete with a `stream_truncated` error. One might decide not to log such
// errors as there are many non-compliant servers in the wild.
if(ec != net::ssl::error::stream_truncated)
    log(ec);
... and a commit that made ignoring stream_truncated safe [2]:
// ssl::error::stream_truncated, also known as an SSL "short read",
// indicates the peer closed the connection without performing the
// required closing handshake
// [...]
// When a short read would cut off the end of an HTTP message,
// Beast returns the error beast::http::error::partial_message.
// Therefore, if we see a short read here, it has occurred
// after the message has been completed, so it is safe to ignore it.
Ilya Dryomov [Sat, 7 Dec 2024 12:52:41 +0000 (13:52 +0100)]
librbd/migration/HttpClient: drop SslHttpSession::m_ssl_enabled
The remaining callers of disconnect() call it only when m_ssl_enabled
is set to true (i.e. after the handshake is completed):
- shut_down(), in STATE_READY
- maybe_finalize_reset(), very shortly after transitioning out of
STATE_READY as part of performing a reset
- advance_state(), on a transition to STATE_READY that is intercepted
by a previously delayed shut down
m_ssl_enabled isn't used outside of disconnect() and on top of that
is never cleared.