instead of using the default / existing distro, specify the distro for
testing. this change prevents us from using bionic for testing the
builds. this has two consequences:
* we are one step closer to a non-bionic world.
* we avoid building packages with the PPA repo, which *might* introduce runtime
dependencies on 3rd party runtimes provided by the PPA repo.
Sage Weil [Mon, 12 Apr 2021 15:45:50 +0000 (11:45 -0400)]
Merge PR #40736 into master
* refs/pull/40736/head:
mgr/cephadm: rewrite/simplify describe_service
mgr/orchestrator: report osds as osd.unmanaged as appropriate
mgr/orchestrator: remove IMAGE ID from 'orch ls'
all the scripts except for test_cls_cas.sh under qa/workunits/cls
are executable. to be more consistent, add the executable bit to
test_cls_cas.sh as well.
also, these scripts are launched by src/script/gen-corpus.sh directly,
so it's convenient to just call them.
if we happen to run this script on a host where /etc/ceph/ceph.conf is
available, ceph CLI would use it instead. so, point it to $PWD/ceph.conf
instead.
Changes to the socket code now result in EINVAL being returned.
In the past ENOENT was returned, which is the FreeBSD error code
if DNS lookup does not work.
That change is probably because somewhere in the code the
error code is not passed verbatim from the system call, but is
rewritten in an extra evaluation.
Signed-off-by: Willem Jan Withagen <wjw@digiware.nl>
FreeBSD ceph-dencoder crashes in the exit() calls, due to
invalid pointer references during the release process of
the loaded libraries.
Often this is signaled by libc reporting:
__cxa_thread_call_dtors: dtr 0x47efc0 from unloaded dso, skipping
The cause for this is different behaviour between FreeBSD and Linux:
https://groups.google.com/g/bsdmailinglist/c/22ncTZAbDp4/m/Dii_pII5AwAJ
_The FreeBSD implementation here looks racy. If one thread dlcloses an
object while another thread is exiting, we can end up calling a
function at an invalid memory address. It also looks as if it may
be possible to unload one library, load another at the same address,
and end up executing entirely the wrong code, which would have some
serious security implications.
The GNU/Linux equivalent of this function locks the DSO in memory
until all references to it have gone away. A call to dlclose() on
GNU/Linux will not actually unload the library until all threads
with destructors in that library have been unloaded. I believe
that this reuses the same reference counting mechanism that
allows the same library to be dlopened and dlclosed multiple times.
Signed-off-by: Willem Jan Withagen <wjw@digiware.nl>
Sage Weil [Fri, 9 Apr 2021 20:26:00 +0000 (16:26 -0400)]
mgr/cephadm: rewrite/simplify describe_service
The prior implementation first tried to fabricate services based on the
running daemons, and then filled in defined services on top. This led
to duplication and a range of small errors.
Instead, flip this around: start with the services that are defined,
and only fill in 'unmanaged' services where we need to.
Drop the osd kludges and instead rely on DaemonDescription.service_id to
return the right thing.
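The "defined services first, unmanaged only where needed" ordering described above can be sketched roughly as follows. This is an illustration only, not the mgr/cephadm code: the `Daemon` and `ServiceSummary` types, their fields, and the `describe_services` helper are hypothetical names, and counting daemons as the size of an unmanaged service is an assumption.

```python
from dataclasses import dataclass
from typing import Dict, Iterable, List


@dataclass
class Daemon:
    # Hypothetical stand-in for a daemon description; service_name plays the
    # role of the value derived from DaemonDescription.service_id.
    service_name: str
    running: bool


@dataclass
class ServiceSummary:
    # Hypothetical summary row, loosely modelled on one line of 'orch ls'.
    service_name: str
    unmanaged: bool = False
    size: int = 0
    running: int = 0


def describe_services(defined: Dict[str, int],
                      daemons: Iterable[Daemon]) -> List[ServiceSummary]:
    # 1. Start with the services that are defined (name -> expected size).
    out = {name: ServiceSummary(name, size=size) for name, size in defined.items()}

    # 2. Walk the daemons; only create an 'unmanaged' entry when a daemon
    #    does not belong to any defined service.
    for d in daemons:
        summary = out.get(d.service_name)
        if summary is None:
            summary = out[d.service_name] = ServiceSummary(d.service_name, unmanaged=True)
        if summary.unmanaged:
            # Assumption: an unmanaged service has no spec, so its size is
            # simply the number of daemons seen for it.
            summary.size += 1
        if d.running:
            summary.running += 1

    return sorted(out.values(), key=lambda s: s.service_name)
```

Because every defined service is inserted exactly once up front, a daemon can never cause a second entry for it, which is the duplication the prior approach was prone to.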
rgw: test `radosgw-admin radoslist` and incomplete multiparts better
Make sure there are more than 1000 incomplete multiparts and also make
sure one of the incomplete multiparts has at least 1000 parts. This
test is done indirectly through rgw-orphan-list, which invokes
`radosgw-admin radoslist`.
Also, clean up shell flags, so script output is less verbose.
Signed-off-by: J. Eric Ivancich <ivancich@redhat.com>
Sage Weil [Sat, 3 Apr 2021 13:14:00 +0000 (09:14 -0400)]
cephadm: normalize unqualified repo digests to docker.io
A RepoDigests returned by docker|podman image inspect can either include
the docker.io/ prefix or not. For reasons that aren't entirely clear,
this may vary between hosts in a cluster. However, ceph/ceph@sha256:abc...
is the same thing as docker.io/ceph/ceph@sha256:abc..., and should be
treated as such. Otherwise, upgrade can get into a loop where it pulls
the image on a new host, finds the other variant of the repodigests,
sees no overlap, updates target_digests, and restarts. (It will then
find the first variant again on the first host and loop.)
Avoid this by normalizing any docker.io digests by always including the
docker.io/ prefix.
Note that it is technically possible that this assumption is wrong: it
may be that the image that already exists on the local host is from a
different registry in registries.conf's unqualified-search-registries.
However, we don't know which, since this is a search list. In practice,
it should be exceedingly rare that an image that *we* are installing using
a fully-qualified image name will end up having an unqualified repodigest
in the local registry. Hopefully!
Fixes: https://tracker.ceph.com/issues/50114
Signed-off-by: Sage Weil <sage@newdream.net>
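The digest normalization described above can be sketched as follows. This is a minimal illustration under stated assumptions, not the cephadm implementation: the `qualify_digest` name is hypothetical, and the test for "already qualified" (first path component contains a dot or a colon, or is `localhost`) is an assumed heuristic.

```python
def qualify_digest(digest: str, default_registry: str = "docker.io") -> str:
    """Return the digest with a registry prefix, assuming docker.io if absent."""
    first, sep, _rest = digest.partition("/")
    # Assumed heuristic: a leading component that looks like a host name
    # ("quay.io", "localhost:5000", ...) means the name is already qualified.
    if sep and ("." in first or ":" in first or first == "localhost"):
        return digest
    return f"{default_registry}/{digest}"


def digests_overlap(a, b) -> bool:
    # Compare two RepoDigests lists only after normalization, so that the
    # two spellings of the same docker.io image are treated as equal.
    return bool({qualify_digest(d) for d in a} & {qualify_digest(d) for d in b})
```

With this in place, `ceph/ceph@sha256:abc...` reported on one host and `docker.io/ceph/ceph@sha256:abc...` reported on another compare as the same image, so the overlap check no longer fails and the upgrade loop is not triggered.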
If we get an unqualified target image, assume it's docker.io. This
ensures that we're passing a fully-qualified target to docker|podman on
the various hosts and don't end up with something different based on the
per-host search path for unqualified image names.
This is my second attempt to rewrite the
second half of the mclock docs. The first attempt
is enshrined in https://github.com/ceph/ceph/pull/40571,
in which I got cute with git and got burned.
Sage Weil [Fri, 9 Apr 2021 19:35:17 +0000 (15:35 -0400)]
mgr/orchestrator: remove IMAGE ID from 'orch ls'
This is not very useful at this level:
- we see it from 'orch ps'
- it can be a mix of ids during upgrade
- some services may have multiple images at steady state (e.g., ingress)
When an incomplete multipart upload has in excess of 1000 parts,
looping over those parts was not handled properly, causing an infinite
loop. The paging/marker is now handled correctly.
Signed-off-by: J. Eric Ivancich <ivancich@redhat.com>
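The paging/marker handling mentioned above follows a common pattern, sketched here in Python for illustration only. The actual fix lives in the rgw code; `iter_parts`, `make_fake_lister`, and the `(parts, truncated, next_marker)` page shape are hypothetical names chosen for this sketch.

```python
from typing import Callable, Iterator, List, Tuple

# Hypothetical page shape: (part numbers, truncated flag, marker for next page)
Page = Tuple[List[int], bool, int]


def iter_parts(list_page: Callable[[int, int], Page],
               page_size: int = 1000) -> Iterator[int]:
    """Yield every part, fetching at most page_size parts per request."""
    marker = 0
    while True:
        parts, truncated, next_marker = list_page(marker, page_size)
        yield from parts
        if not truncated:
            return
        # Guard against the failure mode described above: if the marker
        # does not advance, the same page would be fetched forever.
        if next_marker <= marker:
            raise RuntimeError("listing marker did not advance")
        marker = next_marker


def make_fake_lister(total: int) -> Callable[[int, int], Page]:
    """Build an in-memory lister over parts 1..total, for demonstration."""
    def list_page(marker: int, max_parts: int) -> Page:
        parts = list(range(marker + 1, min(marker + max_parts, total) + 1))
        next_marker = parts[-1] if parts else marker
        return parts, next_marker < total, next_marker
    return list_page
```

Calling `list(iter_parts(make_fake_lister(2500)))` walks three pages and terminates, which mirrors the "more than 1000 parts" case that the test coverage targets.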
FreeBSD also has msghdr, but it does not have systemd or
flags like MFD_ALLOW_SEALING, O_TMPFILE or F_SEAL_GROW. so
use WITH_SYSTEMD to enable the journald backend of the logging system.
also move the "WITH_SYSTEMD" option up so that src/CMakeLists.txt
is able to see the WITH_SYSTEMD variable defined by it.
myoungwon oh [Wed, 31 Mar 2021 06:22:01 +0000 (15:22 +0900)]
osd: remove unnecessary ref handling in _delete_oid
Let's consider the following case when handling a delete op.
1. Delete --> whiteouted
2. Make clone
In this case, current code clears chunk_map and calls dec_all_manifest_refcount()
in _delete_oid() even if the clone still has the references.
To fix this, this commit removes the unnecessary ref handling in _delete_oid() and
makes finish_ctx() handle it instead, since finish_ctx() is aware of whether the
clone has been created or not.
Also, remove the oi.size == 0 condition in finish_ctx() so that ref. counting is
handled upon a delete op with a whiteouted clone.
osd: recover unreadable snapshot before reading ref. count info
Manifest objects need adjacent clones when incrementing/decrementing
refcount. This commit makes the current code call get_manifest_ref_count
before reading ref. count info.
Paul Cuzner [Tue, 2 Feb 2021 01:20:30 +0000 (14:20 +1300)]
mgr/prometheus:Improve the pool metadata
Adds percent_used and used_bytes to the per pool
metrics group, and adds additional labels to the ceph_pool_metadata
metric:
- compression_mode - so consumers can tell whether compression
is active
- description - provide a text string showing the protection
scheme of the pool (e.g. replica2, or ec:4+2)
- type - text string showing replicated or erasure, enabling
easy filtering across the pools
These additional fields allow compression savings to be more
easily shown. The inclusion of percent_used ensures that any
prometheus based view of pool usage will ALWAYS match the CLI.
Fixes: https://tracker.ceph.com/issues/49049
Signed-off-by: Paul Cuzner <pcuzner@redhat.com>
This PR includes 2 things:
1. Changing force-branch to master and removing the git-remote. This change was forgotten for PR #39139.
2. Proper cleanup/removal after completion of commands, more precisely, removing the kafka logs directory.