cephfs-top now contains two options: 'm' for filesystem
selection and 'q' to go back. The home screen displays
the clients belonging to a particular filesystem as a group.
mgr/dashboard: Add details to the modal which displays the `safe-to-destroy` result
- Add warning-type information when the OSDs are not safe to destroy
- Add info-type information when the OSDs are safe to destroy
Fixes: https://tracker.ceph.com/issues/37327
Signed-off-by: Francesco Torchia <francesco.torchia@suse.com>
(cherry picked from commit 0d6100bbf99ffa8da0e099343ede050f1cca509c)
Venky Shankar [Wed, 19 May 2021 05:27:12 +0000 (01:27 -0400)]
cephfs-top: switch to displaying average latencies and stdev
Do away with cumulative latencies -- those are not very useful.
However, these types need to be maintained since the `perf stats`
command (via the mgr/stats plugin) includes them. So, maintain a
legacy metrics list which is ignored when choosing metrics to
display.
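A minimal sketch of the kind of derivation involved (not the actual cephfs-top code; the counter names are hypothetical): given cumulative latency sums and an op count, an average and standard deviation can be computed like this.
```python
# Hypothetical counters: total_latency and total_sq_latency are cumulative
# sums of per-op latency and latency squared; ops is the op count.
import math

def avg_and_stdev(total_latency, total_sq_latency, ops):
    if ops == 0:
        return 0.0, 0.0
    avg = total_latency / ops
    # variance = E[x^2] - (E[x])^2, clamped against negative rounding error
    var = max(total_sq_latency / ops - avg * avg, 0.0)
    return avg, math.sqrt(var)

print(avg_and_stdev(12.0, 50.0, 4))   # -> (3.0, ~1.87)
```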
mgr/stats: missing clients in perf stats command output.
After `perf stats` has been run once with the existing filesystems,
it does not pick up client info for newly created filesystems, or for
filesystems created after another filesystem fails.
Xiubo Li [Mon, 15 Aug 2022 07:15:43 +0000 (15:15 +0800)]
client: abort the client if we couldn't invalidate dentry caches
The option 'client_die_on_failed_dentry_invalidate' requires killing
the client when it fails to invalidate the dentry caches in the kernel.
The CephFS client requires a mechanism to invalidate dentries in the
caller (e.g. the kernel for ceph-fuse) when capabilities must be recalled.
If the client cannot do this then the MDS cache cannot shrink which
can cause the MDS to fail.
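A toy sketch of that behaviour (hypothetical names, not the C++ client code): when the dentry-invalidation upcall fails and the option is enabled, the client aborts rather than letting the MDS cache grow unbounded.
```python
# die_on_failure stands in for client_die_on_failed_dentry_invalidate
def handle_dentry_invalidate(upcall, die_on_failure=True):
    try:
        upcall()                      # ask the kernel to drop the dentry
    except OSError as e:
        if die_on_failure:
            raise SystemExit(f'client: dentry invalidation failed ({e}), aborting')
        return False
    return True
```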
Xiubo Li [Mon, 15 Aug 2022 09:50:27 +0000 (17:50 +0800)]
client: stop the remount_finisher thread in the Client::unmount()
ceph_fuse will unmount the client, then finalize the cfuse instance
and at the same time free the mountpoint memory, and only at the end
try to stop the remount_finisher thread. The remount_finisher thread
may then use the freed mountpoint to do the remount, which causes
unexpected remount failures.
Just stop the remount_finisher thread in the Client::unmount().
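A small sketch of the ordering this enforces (a Python stand-in, not the C++ client): the background worker is joined inside unmount(), before the mount state is released, so it can never touch freed memory.
```python
import threading

class Client:
    def __init__(self):
        self.mountpoint = object()          # stands in for the real mount state
        self._stop = threading.Event()
        self._remount_finisher = threading.Thread(target=self._remount_loop)
        self._remount_finisher.start()

    def _remount_loop(self):
        while not self._stop.wait(0.1):
            _ = self.mountpoint             # would be a use-after-free if freed first

    def unmount(self):
        self._stop.set()
        self._remount_finisher.join()       # stop the worker first...
        self.mountpoint = None              # ...then release the mount state

Client().unmount()                          # safe shutdown order
```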
osd, mds: fix the "heap" admin cmd printing always to error stream
Before the patch `ceph::osd_cmds::heap()` was confusing
the concepts of _stderr_ and _stdout_. This was the direct
cause of the differences in output between `ceph tell` and
`ceph daemon`.
Thanks to Laura Flores who made the extremely useful observation
noted in https://tracker.ceph.com/issues/57119#note-3.
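A toy illustration of the distinction the fix restores (not the OSD/MDS code): the heap report is ordinary command output and belongs on the output stream; only genuine failures should go to the error stream.
```python
import sys

def heap_command(cmd, out=sys.stdout, err=sys.stderr):
    if not cmd:
        print('no heap command given', file=err)   # a real error
        return 1
    print(f'heap {cmd}: tcmalloc output would go here', file=out)
    return 0

heap_command('stats')   # the report lands on stdout for both `ceph tell` and `ceph daemon`
```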
mgr/cephadm: Adding logic to store grafana cert/key per node
Fixes: https://tracker.ceph.com/issues/56508
Signed-off-by: Redouane Kachach <rkachach@redhat.com>
(cherry picked from commit 3c990f974e3beac0fc03f58c4c47f26f9d5afe56)
Adam King [Wed, 17 Aug 2022 20:54:54 +0000 (16:54 -0400)]
cephadm: return nonzero exit code when applying spec fails in bootstrap
This is mostly useful for testing automation, but right now if applying the
spec provided with --apply-spec fails, the return code remains zero. We don't
want to error out entirely in that case as we still want to print the remaining
output (e.g. the dashboard password). Continuing onward and then returning a
nonzero code strikes a balance: we still give all the output, while making it
easier for those writing automation around bootstrap to detect the failure.
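A sketch of that balance (not cephadm's actual code): remember the --apply-spec failure, keep printing the rest of the bootstrap output, and only surface the failure through the exit code at the end.
```python
import sys

def bootstrap(apply_spec_ok=True):
    rc = 0
    if not apply_spec_ok:
        print('Applying the provided spec failed', file=sys.stderr)
        rc = 1                            # remember, but do not bail out yet
    print('Dashboard password: ...')      # remaining output is still shown
    return rc

sys.exit(bootstrap(apply_spec_ok=True))
```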
Adam King [Wed, 24 Aug 2022 19:13:15 +0000 (15:13 -0400)]
qa/cephadm: remove fsid dir before bootstrap in test_cephadm.sh
The shell commands we test beforehand can create the
/var/lib/ceph/00000000-0000-0000-0000-0000deadbeef directory
and that directory being present will block bootstrap as
it will think a cluster with this fsid already exists.
Adam King [Mon, 22 Aug 2022 15:14:12 +0000 (11:14 -0400)]
mgr/cephadm: allow setting prometheus retention time
When we deploy the Prometheus server, we don't provide any
ability to define the TSDB retention time, so it defaults to 15d.
This change adds a field to the Prometheus service spec whose
value is passed as the --storage.tsdb.retention.time argument
for the prometheus daemon.
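Illustrative only (the function and parameter names here are hypothetical, not the cephadm module API): a retention time carried in the service spec simply becomes an extra daemon argument.
```python
def prometheus_args(spec_retention_time=None, default='15d'):
    retention = spec_retention_time or default   # prometheus itself defaults to 15d
    return [f'--storage.tsdb.retention.time={retention}']

print(prometheus_args())        # ['--storage.tsdb.retention.time=15d']
print(prometheus_args('30d'))   # ['--storage.tsdb.retention.time=30d']
```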
Paul Cuzner [Mon, 29 Aug 2022 23:54:00 +0000 (11:54 +1200)]
cephadm: Fix disk size calculation
With native 4k sectors, the logical blocksize is set to
4096, which yields a disk size 8x the size of the actual
device. According to the kernel source, the device size is always
reported in 512-byte sectors, so the use of the logical blocksize
is unnecessary.
Fixes: https://tracker.ceph.com/issues/57335
Signed-off-by: Paul Cuzner <pcuzner@redhat.com>
(cherry picked from commit a6f10ebd572cbf95c94614a94f981ca3550fca25)
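A minimal sketch of the corrected calculation (not the cephadm code itself): the kernel's /sys/block/<dev>/size value is in 512-byte sectors regardless of the logical block size, so the size in bytes is simply sectors * 512.
```python
def device_size_bytes(sectors_from_sysfs):
    # pre-fix: sectors * logical_block_size -> 8x too big on 4k-native disks
    return sectors_from_sysfs * 512

print(device_size_bytes(7814037168))   # a ~4 TB disk reported correctly
```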
John Mulligan [Mon, 29 Aug 2022 14:03:01 +0000 (10:03 -0400)]
qa/tasks/kubeadm: set up tigera resources via kubectl create
Fixes: https://tracker.ceph.com/issues/57268
The tigera operator for the calico CNI has some pretty large resource
definitions. The length of those definitions can cause "client-side
apply", the default mode for `kubectl apply ...`, to fail due to the
length of the annotation that would result:
```
2022-08-22T20:24:55.636 INFO:teuthology.orchestra.run.smithi087.stdout:clusterrolebinding.rbac.authorization.k8s.io/tigera-operator created
2022-08-22T20:24:55.670 INFO:teuthology.orchestra.run.smithi087.stdout:deployment.apps/tigera-operator created
2022-08-22T20:24:55.671 INFO:teuthology.orchestra.run.smithi087.stderr:The CustomResourceDefinition "installations.operator.tigera.io" is invalid: metadata.annotations: Too long: must have at most 262144 bytes
2022-08-22T20:24:55.674 DEBUG:teuthology.orchestra.run:got remote process result: 1
```
There are two simple options for avoiding this error. One is to use
`kubectl create`. The create command will not make this lengthy
annotation. It will fail if any of the resources already exist. The
other option is to use server-side apply, via the `kubectl apply
--server-side ...` command. It is new in k8s 1.18. It will not create
the annotation either.
The block of code setting up the CNI already uses `kubectl create` to
create the custom resources that configure the tigera operator.
Therefore it should be safe to assume the block of code in question
doesn't need to be idempotent and we can also use `kubectl create`
elsewhere in the same block.
Zac Dover [Fri, 12 Aug 2022 21:53:21 +0000 (07:53 +1000)]
doc/rados: add prompts to pools.rst
This commit adds ".. prompt:: bash $"-style prompts to pools.rst.
This brings this file up to the standard established in 2020 when
Kefu added support for the ".. prompt::" directive.
This commit is a part of an initiative to modernize the presentation
of all BASH commands in the RADOS documentation.
The progress of this project can be tracked here:
https://tracker.ceph.com/issues/57108
When generating tags, the order of endpoints wasn't taken into account.
Two endpoints with the same URL prefix, for example `/api/cluster/` and
`/api/cluster/user`, have different docs, and the tag is generated from
the doc of one of these two; since the order of these endpoints might
vary, it is imperative to sort them to get a deterministic output.
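A sketch of the idea behind the fix (hypothetical data, not the dashboard code): sort the matching endpoints by URL before deriving the tag so the result no longer depends on iteration order.
```python
endpoints = [
    {'url': '/api/cluster/user', 'doc': 'Cluster user management'},
    {'url': '/api/cluster/',     'doc': 'Cluster operations'},
]

def tag_for_prefix(prefix, endpoints):
    matching = sorted((e for e in endpoints if e['url'].startswith(prefix)),
                      key=lambda e: e['url'])
    # always take the doc of the first endpoint in sorted order
    return matching[0]['doc'] if matching else None

print(tag_for_prefix('/api/cluster', endpoints))   # deterministic output
```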
test/{librbd, rgw}: increase delay between and number of bind attempts
Commit aa7885f7cc41 ("test/{librbd, rgw}: retry when bind fail with
port 0") significantly reduced the frequency of sporadic unit test
failures caused by EADDRINUSE, but did not eliminate them.
Currently, it yields a cumulative sleep of ~9 seconds. Let's increase
that to 1 minute.
test/{librbd, rgw}: retry when bind fail with port 0
There is a chance that the bind() call may fail if another test
happens to pick the free port chosen by the operating system. In this
case, we just retry up to 42 times.
In theory, this change does not fully address the race, but it should
help to alleviate the issue.
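A Python stand-in for the retry loop described in these two commits (the real helper is C++; the numbers are illustrative): re-try bind() with a short sleep so a port stolen by another test is eventually re-acquired.
```python
import errno, socket, time

def bind_with_retries(port, attempts=42, delay=1.5):
    for i in range(attempts):
        sock = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
        try:
            sock.bind(('127.0.0.1', port))
            return sock                      # success
        except OSError as e:
            sock.close()
            if e.errno != errno.EADDRINUSE or i == attempts - 1:
                raise
            time.sleep(delay)                # cumulative wait grows with the attempt count
```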
With auto-deletion of trashed snapshots, it is relatively easy to lose
a race to "rbd flatten" as follows:
- when V2_GET_PARENT runs, the image is technically still a clone
- when V2_REFRESH_PARENT runs, the image is fully flattened and the
snapshot in the parent image is deleted
This results in a spurious ENOENT error, mainly when trying to open the
image (e.g. for "rbd info"). This race condition has always been there
but auto-deletion of trashed snapshots makes it much worse.
Retry ENOENT in V2_REFRESH_PARENT the same way as in V2_GET_SNAPSHOTS.
librbd: fix a bunch of issues with restarting RefreshRequest
Make RefreshRequest properly restartable, at least up to and including
the V2_REFRESH_PARENT step:
- clear m_migration_spec when skipping GET_MIGRATION_HEADER
- don't rely on potentially stale m_incomplete_update on retry
- reset m_legacy_parent when retrying more than just V2_GET_PARENT
- don't rely on potentially stale m_parent_md.overlap and
m_head_parent_overlap on retry
- clear m_metadata before fetching image metadata (but not before
fetching pool metadata)
- clear m_op_features when skipping V2_GET_OP_FEATURES
- clear m_group_spec on EOPNOTSUPP error in V2_GET_GROUP
- reset m_legacy_snapshot when retrying more than just V2_GET_SNAPSHOTS
- don't rely on potentially stale m_snap_parents on retry