Venky Shankar [Wed, 19 May 2021 05:27:12 +0000 (01:27 -0400)]
cephfs-top: switch to displaying average latencies and stdev
Do away with cumulative latencies -- those are not much useful.
However, these types need to be maintained since `perf stats`
command (via mgr/stats plugin) includes them. So, maintain a
legacy metrics list which is ignored when choosing metrics to
display.
mgr/stats: missing clients in perf stats command output.
perf stats doesn't get the client info w.r.t new filesystems
created or filesystems created on failing other filesystem
after running the perf stats command once with existing filesystems.
mgr/cephadm: Adding logic to store grafana cert/key per node Fixes: https://tracker.ceph.com/issues/56508 Signed-off-by: Redouane Kachach <rkachach@redhat.com>
(cherry picked from commit 3c990f974e3beac0fc03f58c4c47f26f9d5afe56)
Adam King [Wed, 17 Aug 2022 20:54:54 +0000 (16:54 -0400)]
cephadm: return nonzero exit code when applying spec fails in bootstrap
This is mostly useful for testing automation, but right now if applying the
spec provided with --apply-spec fails, the return code remains zero. We don't
want to error out entirely in that case as we still want to print the remaining
output (e.g. the dashboard password). Continuing onward and then returning a
nonzero code could provide a balance where we still give all the output but
still have something to make it easier for those writing automation around bootstrap.
Adam King [Mon, 22 Aug 2022 15:14:12 +0000 (11:14 -0400)]
mgr/cephadm: allow setting prometheus retention time
When we deploy Prometheus server, we don't provide any
ability to define the tsdb retention time - so it defaults to 15d.
This change adds a field that can be passed in a prometheus service
spec that will be passed as an arg to the --storage.tsdb.retention.time
parameter for the prometheus daemon.
Paul Cuzner [Mon, 29 Aug 2022 23:54:00 +0000 (11:54 +1200)]
cephadm: Fix disk size calculation
With native 4k sectors, the logical blocksize is set to
4096, which yields a disk size 8x the size of the actual
device. According to kernel source, device size only
uses 512 byte sectors, so the use of logical blocksize
is unnecessary.
Fixes: https://tracker.ceph.com/issues/57335 Signed-off-by: Paul Cuzner <pcuzner@redhat.com>
(cherry picked from commit a6f10ebd572cbf95c94614a94f981ca3550fca25)
Zac Dover [Fri, 12 Aug 2022 21:53:21 +0000 (07:53 +1000)]
doc/rados: add prompts to pools.rst
This commit adds ".. prompt:: bash $"-style prompts to pools.rst.
This brings this file up to the standard established in 2020 when
Kefu added support for the ".. prompt::" directive.
This commit is a part of an initiative to modernize the presentation
of all BASH commands in the RADOS documentation.
The progress of this project can be tracked here:
https://tracker.ceph.com/issues/57108
When generating tags the order of endpoints wasn't taken into account.
Two endpoints with the same url prefix, for example `/api/cluster/` and
`/api/cluster/user`, have different docs and the tags is generated from
a doc of one of these two, and since the order of these endpoints might
vary it is imperative to sort them to have a deterministic output.
test/{librbd, rgw}: increase delay between and number of bind attempts
Commit aa7885f7cc41 ("test/{librbd, rgw}: retry when bind fail with
port 0") reduced the frequency of sporadic unit test failures caused
by EADDRINUSE a lot, but not entirely.
Currently, it yields a cumulative sleep of ~9 seconds. Let's increase
that to 1 minute.
test/{librbd, rgw}: retry when bind fail with port 0
there is chance that the bind() call may fail if we have another test
happen to pick the free port picked by operating system. in this case,
we just retry up to 42 times.
in theory, this change does not fully address the racing, but it should
help to alleviate this issue.
With auto-deletion of trashed snapshots, it is relatively easy to lose
a race to "rbd flatten" as follows:
- when V2_GET_PARENT runs, the image is technically still a clone
- when V2_REFRESH_PARENT runs, the image is fully flattened and the
snapshot in the parent image is deleted
This results in a spurious ENOENT error, mainly when trying to open the
image (e.g. for "rbd info"). This race condition has always been there
but auto-deletion of trashed snapshots makes it much worse.
Retry ENOENT in V2_REFRESH_PARENT the same way as in V2_GET_SNAPSHOTS.
librbd: fix a bunch of issues with restarting RefreshRequest
Make RefreshRequest properly restartable, at least up until and including
V2_REFRESH_PARENT step:
- clear m_migration_spec when skipping GET_MIGRATION_HEADER
- don't rely on potentially stale m_incomplete_update on retry
- reset m_legacy_parent when retrying more than just V2_GET_PARENT
- don't rely on potentially stale m_parent_md.overlap and
m_head_parent_overlap on retry
- clear m_metadata before fetching image metadata (but not before
fetching pool metadata)
- clear m_op_features when skipping V2_GET_OP_FEATURES
- clear m_group_spec on EOPNOTSUPP error in V2_GET_GROUP
- reset m_legacy_snapshot when retrying more than just V2_GET_SNAPSHOTS
- don't rely on potentially stale m_snap_parents on retry
qa: filter internal directories in 'subvolumegroup ls' command
Internal directories: '_nogroup', '_index', '_legacy', '_deleting'
1. Internal directories should be filtered in 'subvolmegroup ls' command.
2. Internal directories should not be accepted as a group name.
mgr/volumes: filter internal directories in 'subvolumegroup ls' command
Internal directories: '_nogroup', '_index', '_legacy', '_deleting'
1. Internal directories should be filtered in 'subvolmegroup ls' command.
2. Internal directories should not be accepted as a group name.
Tim Serong [Thu, 21 Jul 2022 05:55:19 +0000 (15:55 +1000)]
cephfs-shell: move source to separate subdirectory
This ensures the package discovery done by python setuptools >= 61
doesn't get confused when building cephfs-shell and cephfs-top.
This commit moves cephfs-shell to a separate "shell" subdirectory,
which is the same approach we've already got with the cephfs-top
tool being in a separate "top" subdirectory.
Tamar Shacked [Sun, 15 May 2022 08:39:22 +0000 (11:39 +0300)]
client: allow overwrites to files with size greater than the max_file_size cfg
Before this change, overwriting from file-offset >= max_file_size config
returns "File too large" (even though the data is being written)
This change allow overwrites as the file size is not further increasing.
Zac Dover [Tue, 30 Aug 2022 11:48:08 +0000 (21:48 +1000)]
doc/start: update documenting-ceph branch names
This PR updates the branch names in the
documenting-ceph.rst file. It gets rid of all references
to the "master" branch, and updates the language to
reflect the state of play in 2022.
inb4: This PR merely removes the most egregious inaccuracies,
the ones that were most readily evident on a cursory perusal.
The full text remains to be carefully read and fitted together
with care.
Lucian Petrut [Fri, 26 Aug 2022 12:54:10 +0000 (12:54 +0000)]
include: fix IS_ERR on Windows
The "long" type uses 32b on x64 Windows platforms, which means
it's not large enough to store a pointer. intptr_t or uintptr_t
should be used instead.
This change fixes include/err.h, using the right types. There was
a previous patch on this topic but unfortunately it didn't address
all the type casts.
This issue was brought up by the unittest_crush test, which recently
started to fail as the CrushWrapper methods use IS_ERR.
cmake: link denc-mod-rgw against Boost::filesystem
to address the runtime link failure.
this change is not cherry-picked from main branch. as, in main branch,
the Boost::filesystem linkage is pulled in by rgw_common, which was
changed to a static library in 43d10b9e44ca50700e9076a47f2c38b360d1d632.
but this change is not included in pacific. so rgw added the linkage
via rgw_libs CMake variable. unfortunately, the lexical scope of this
variable does not not include tools/ceph-dencoder/CMakeLists.txt, so
we have to add this linkage manually here.
Signed-off-by: Tim Serong <tserong@suse.com> Signed-off-by: Kefu Chai <tchaikov@gmail.com>
Ilya Dryomov [Tue, 30 Aug 2022 09:45:44 +0000 (11:45 +0200)]
rbd-mirror: skip setting error code on snapshot replayer shutdown
This is regarding failures in unregister_remote_update_watcher() and
unregister_local_update_watcher(). handle_replay_complete() can't be
called in these cases anymore as it would blindly attempt to unregister
watchers from scratch again. Dropping handle_replay_complete() calls
there means that these failures would only be logged and would not be
surfaced by snapshot replayer. But the only caller ignores them
anyway:
void ImageReplayer<I>::shut_down(int r) {
...
// close the replayer
if (m_replayer != nullptr) {
ctx = new LambdaContext([this, ctx](int r) {
m_replayer->destroy();
m_replayer = nullptr;
ctx->complete(0); <------
});
ctx = new LambdaContext([this, ctx](int r) {
m_replayer->shut_down(ctx);
});
}
Ilya Dryomov [Wed, 24 Aug 2022 10:56:31 +0000 (12:56 +0200)]
rbd-mirror: resume pending shutdown on error in snapshot replayer
If a shutdown is requested, e.g. by update_pool_replayers() because
remote RADOS instance got blocklisted, and Replayer::shut_down() pends
it on completion of current snapshot sync, it gets stuck if replayer
encounters an error in the interim. This is particularly likely in the
blocklist case: a higher layer may detect that client got blocklisted
and request a shutdown first, and then when replayer sees EBLOCKLISTED
in turn, it calls handle_replay_complete() -- which does not resume
a pending shutdown. Because update_pool_replayers() blocks on shutdown
with Mirror::m_lock held, eventually the entire daemon hangs in
perpetuity.