Paul Cuzner [Wed, 2 Jun 2021 23:34:19 +0000 (11:34 +1200)]
mgr/cephadm:fix alerts sent to wrong URL
The path_prefix in prometheus.yml was specifying an
endpoint prefix, which was invalid. This resulted in 404
errors when trying to send alerts to alertmanager and
blocked alerts being sent on to the ceph-dashboard API
receiver. This fix removes the prefix.
Fixes: https://tracker.ceph.com/issues/51073
Signed-off-by: Paul Cuzner <pcuzner@redhat.com>
(cherry picked from commit 9d408a70c7d01fd7c94f9b814af916396d7cbf1f)
mgr/cephadm: fix issue with missing prometheus alerts
Files passed as configuration to the cephadm binary were not created
and mapped into the container unless they were listed in the
required files section inside cephadm. This prevented optional file
includes in the configuration.
The configuration file for the Prometheus default alerts is not
mandatory and hence wasn't included in the required files section, but it
still needs to be added to the container by cephadm.
This change enables optional files to be included in the configuration
for monitoring components, so that those files are created and mapped
within the container.
Note that a `required_files` variable has been removed at one position
in these changes, though it wasn't used to ensure that required files
were included in the configuration at that point anyway. The test which
ensures that all required files are passed is somewhere else.
Deepika Upadhyay [Wed, 26 May 2021 09:11:55 +0000 (14:41 +0530)]
rados/cephadm/qa/distros: update to latest distros
- removes ubuntu_18.04 support for podman, instead we move to focal.
- use rhel_8.3 for all rhel_8 references
- use {centos/rhel}_8 instead of {rhel/centos}_latest to keep master and
octopus consistent, since rhel_8 and centos_8 serve as the latest
version symlinks; the two branches diverged after an octopus-only commit.
this was not cherry picked from master because some of the octopus
symlinks were not in sync with master. this commit cleans them up
and tries to make them similar to master.
Sage Weil [Wed, 3 Mar 2021 14:14:29 +0000 (08:14 -0600)]
qa: new kubic distro files; use kubic podman for centos/rhel
The current centos/rhel version of podman (2.2.1) is broken.
- create new qa/distros/podman/* files that install kubic podman
- include centos/rhel variants
- adjust cephadm jobs to use new yaml files
- remove old qa/distros/all/*_podman.yaml files
trivial fix: we do not have cephadm/thrash suite in octopus(removed)
- distro(from octopus) renamed to 0-distro(from pacific)
Kefu Chai [Fri, 4 Jun 2021 03:25:12 +0000 (11:25 +0800)]
debian/control: ceph-mgr-modules-core does not Recommend ceph-mgr-rook anymore
per https://www.debian.org/doc/debian-policy/ch-relationships.html
> Recommends
> This declares a strong, but not absolute, dependency.
>
> The Recommends field should list packages that would be found together
> with this one in all but unusual installations.
ceph-mgr-modules-core provides a set of ceph-mgr modules which are
always enabled. but the rook module enables ceph-mgr to install and
configure a Ceph cluster using Rook. this module is very useful but
it does not have such a strong connection with ceph-mgr-modules-core.
we can always install it separately for better integration with
Rook.
Cory Snyder [Fri, 28 May 2021 19:08:49 +0000 (15:08 -0400)]
mgr/DaemonServer.cc: prevent integer underflow that is triggered by large increases to pg_num/pgp_num
This fixes a scenario where mgrs continually crash while attempting to apply large increases to pg_num/pgp_num. The max step size (estmax) for each incremental update to the pgp_num is calculated as a percentage of the pg_num, which permits the possibility for the max step size (estmax) to be greater than the current pgp_num when the increase is large; this causes an integer underflow when the max step size is subtracted from the pgp_num in order to calculate the next step size with std::clamp. The integer underflow causes hi < lo in args passed to std::clamp, which causes a failed assertion, SIGABRT, and ultimately crashing mgr.
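The failure mode described above can be sketched in Python by emulating unsigned 32-bit arithmetic. This is only an illustration, not the mgr code: the function names, the 32-bit width, and the sample numbers are assumptions made for the example, and `clamp` raises where `std::clamp` would hit its `lo <= hi` precondition.

```python
UINT32_MASK = 0xFFFFFFFF  # assumption: emulate 32-bit unsigned wraparound


def clamp(value, lo, hi):
    """Clamp value into [lo, hi]; reject hi < lo, as std::clamp requires."""
    if hi < lo:
        raise ValueError("clamp precondition violated: hi < lo")
    return max(lo, min(value, hi))


def next_step_unsafe(current, target, estmax):
    """Hypothetical pre-fix logic: the subtraction wraps when estmax > current."""
    lo = (current - estmax) & UINT32_MASK
    hi = (current + estmax) & UINT32_MASK
    return clamp(target, lo, hi)


def next_step_safe(current, target, estmax):
    """Hypothetical guarded logic: never let the lower bound go below zero."""
    lo = current - estmax if current > estmax else 0
    hi = current + estmax
    return clamp(target, lo, hi)
```

With a current pgp_num of 16, a target of 4096, and a step limit of 204, `next_step_safe` moves to 220, while `next_step_unsafe` computes a lower bound near 2^32 and fails the precondition.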
Igor Fedotov [Mon, 17 May 2021 19:23:26 +0000 (22:23 +0300)]
os/bluestore: fix unexpected ENOSPC in Avl/Hybrid allocators.
The Avl allocator was returning an unexpected ENOSPC in first-fit mode if all
size-matching available extents were unaligned and applying the alignment made
all of them shorter than required. Since the lookup was not retried with a
smaller size, ENOSPC was returned.
Additionally, we should proceed with a lookup in best-fit mode even when the
original size has been truncated to match the available size
(force_range_size_alloc==true).
Fixes: https://tracker.ceph.com/issues/50656
Signed-off-by: Igor Fedotov <ifedotov@suse.com>
(cherry picked from commit 0eed13a4969d02eeb23681519f2a23130e51ac59)
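The alignment effect can be sketched as follows. This is a simplified model and not the BlueStore allocator: extents are plain `(start, length)` tuples, and the retry strategy of halving the request is purely an assumption chosen to show why a retry with a smaller size can succeed.

```python
def align_up(offset, alignment):
    """Round offset up to the next multiple of alignment."""
    return (offset + alignment - 1) // alignment * alignment


def usable_length(start, length, alignment):
    """Bytes left in an extent once its start has been aligned."""
    return max(0, start + length - align_up(start, alignment))


def first_fit(extents, want, alignment):
    """Return (offset, length) for the first extent that fits, else None."""
    for start, length in extents:
        # An extent can match by raw size yet be too short after alignment.
        if length >= want and usable_length(start, length, alignment) >= want:
            return align_up(start, alignment), want
    return None


def allocate_with_retry(extents, want, alignment, min_size):
    """Hypothetical retry loop: halve the request until something fits."""
    size = want
    while size >= max(min_size, 1):
        hit = first_fit(extents, size, alignment)
        if hit is not None:
            return hit
        size //= 2
    return None  # corresponds to ENOSPC
```

For a single extent at 0x1800 of length 0x1000 with 0x1000 alignment, a request for 0x1000 bytes finds nothing because only 0x800 bytes remain after alignment, whereas the retry returns `(0x2000, 0x800)`.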
Deepika Upadhyay [Wed, 26 May 2021 19:18:38 +0000 (00:48 +0530)]
octopus: qa/upgrade: disable update_features test_notify with older client as lockowner
* with the recent support for async rbd operations in pacific+, when an
older client (without async support) is being upgraded and simultaneously
interacts with a newer client that expects requests to be async, the
interaction hangs: the return code for request completion is treated as
the acknowledgement of the async request, and the newer client then keeps
waiting for another acknowledgement of request completion.
this should be rare, happening only when the lockowner is an old client,
and should be deferred if compatibility issues arise.
* qa/upgrade: amend upgrade test workunits to use respective stable branches
max_misplaced was replaced by target_max_misplaced_ratio in edbd592ee44e02a5328e1510879555c2f9dcfc9e, but the document was not
sync'ed. let's update it accordingly.
In 7f047005fc72e1f37a45cde2d742bb2eb1e62881, we made the pg removal code
much more efficient. But it started marking the pgmeta object as an unexpected
onode, which in reality is expected to be removed after all the other objects.
This behavior is very easily reproducible in a vstart cluster:
ceph osd pool create test 1 1
rados -p test bench 10 write --no-cleanup
ceph osd pool delete test test --yes-i-really-really-mean-it
Before this patch:
"do_delete_work additional unexpected onode list (new onodes has appeared
since PG removal started[#2:00000000::::head#]" seen in the OSD logs.
After this patch:
"do_delete_work removing pgmeta object #2:00000000::::head#" is seen.
Related to: https://tracker.ceph.com/issues/50466
Signed-off-by: Neha Ojha <nojha@redhat.com>
(cherry picked from commit 0e917f1b1e18ca9e48b3f91110d3a46b086f7d83)
Kefu Chai [Tue, 25 May 2021 06:17:34 +0000 (14:17 +0800)]
mon/OSDMonitor: drop stale failure_info even if can_mark_down()
in a124ee85b03e15f4ea371358008ecac65f9f4e50, we add a check to drop
stale failure_info reports. but if osdmap does not prohibit us from
marking the osd in question down, the branch checking the stale info
is not executed. in general, it is allowed to mark an osd down, so
the fix of a124ee85b03e15f4ea371358008ecac65f9f4e50 just fails to
work.
in this change, we check for stale failure report of osd in question
as long as the osd is not marked down in the same function. this should
address the slow ops of failure report issue.
acting.size() >= pool.info.min_size is meant to check min_size against
acting set participants, but acting is a vector with placeholders.
actingset is the representation with placeholders removed.
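The difference between the two representations can be sketched like this. It is a toy model, not the PeeringState code: the function names are hypothetical, and the placeholder value is assumed to be `CRUSH_ITEM_NONE` (0x7fffffff) marking an empty shard position.

```python
CRUSH_ITEM_NONE = 0x7FFFFFFF  # assumed placeholder for a missing shard


def acting_set(acting):
    """Drop placeholders, leaving only OSDs that really participate."""
    return {osd for osd in acting if osd != CRUSH_ITEM_NONE}


def meets_min_size_buggy(acting, min_size):
    """Counts placeholder slots as if they were live participants."""
    return len(acting) >= min_size


def meets_min_size_fixed(acting, min_size):
    """Counts only real participants."""
    return len(acting_set(acting)) >= min_size
```

For an ec pool with `acting = [3, CRUSH_ITEM_NONE, 7, CRUSH_ITEM_NONE]` and a min_size of 3, the buggy check passes while the fixed one correctly fails. For replicated pools the vector carries no placeholders, so both checks agree.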
The upshot of this bug is that the activation process will basically
ignore min_size for an ec pool allowing writes in cases where it
shouldn't. PastIntervals::check_new_interval, however, performs
the check correctly, and will therefore discount intervals in which
we really did serve writes as not writeable. This can trigger many
different problem conditions including but not limited to:
- Unfound objects due to accepting a last_update with insufficient
osds
- Lost writes
- Crashes due to peering rules being violated
This bug was originally introduced with recovery below min_size in e5a96fd, and then preserved through refactors in 749a13d and 95bec9.
7cb818a exposed it with the expansion of recovery below min_size
to include ec pools (acting.size() is sufficient for replicated
pools).
Fixes: https://tracker.ceph.com/issues/48613
Fixes: https://tracker.ceph.com/issues/48417
Signed-off-by: Samuel Just <sjust@redhat.com>
(cherry picked from commit 642a1c165499bcbd4cfdf907af313ac7ffe44ff4)
Conflicts:
src/osd/PeeringState.h
Fixes the callers rather than also backporting 95bec9873.
ignore BrokenPipeError which is thrown when piping the output of ceph
CLI to a tool which might close its stdin before ceph CLI sends the
whole help message.
Follow approach suggested by Kefu: https://github.com/python/cpython/commit/7b0ed43af55c1e2844aa0ccd5e088b2ddd38dbdb
This doesn't manage the clean-up/exit logic, as that's deferred to the
last part of the __main__ code.
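A minimal sketch of this pattern, modelled on the SIGPIPE note in the Python documentation, is shown below. The helper names are hypothetical and this is not the ceph CLI code itself. As in the commit, the write helper only reports the broken pipe and leaves the exit logic to the caller.

```python
import os
import sys


def write_help(text, stream):
    """Write text to stream; return False if the reader closed the pipe."""
    try:
        stream.write(text)
        # Flush inside the try block so a broken pipe surfaces here
        # instead of at interpreter shutdown.
        stream.flush()
        return True
    except BrokenPipeError:
        return False


def silence_stdout():
    """Point stdout at /dev/null so the final flush at exit cannot fail."""
    devnull = os.open(os.devnull, os.O_WRONLY)
    os.dup2(devnull, sys.stdout.fileno())
```

A caller would typically invoke `silence_stdout()` when `write_help(help_text, sys.stdout)` returns False and then continue to its normal exit path, which covers cases such as piping the help output into `head`.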
Conflicts:
src/test/libcephfs/test.cc
- octopus is missing a bunch of tests, but this doesn't matter because the
commit being cherry-picked did not touch those
Jeff Layton [Mon, 1 Feb 2021 16:04:07 +0000 (11:04 -0500)]
client: add ceph_ll_lookup_vino
Add a new API function for looking up an inode via a vinodeno_t. This
should give ganesha a way to reliably look up snapshot inodes.
We do need to add some special handling for CEPH_SNAPDIRs. If we're
looking for one, then find the non-snapped parent, and then call
open_snapdir to get the snapdir inode.
Also, have the function check the local cache before calling the MDS
to look up an inode.
Fixes: https://tracker.ceph.com/issues/48991
Signed-off-by: Jeff Layton <jlayton@redhat.com>
(cherry picked from commit 70622079c2ec55222a139fa5042902e0b19bd839)
Jeff Layton [Mon, 1 Feb 2021 15:41:14 +0000 (10:41 -0500)]
client: make _lookup_ino take a vinodeno_t
Currently, it always leaves the snapid as 0. Rename it to
_lookup_vino and make it fill the snapid from the vinodeno_t
instead, but only when it's a "real" snapid.
Change the existing callers to pass in a vinodeno_t with the
snapid set to CEPH_NOSNAP.
Jeff Layton [Wed, 3 Feb 2021 13:12:41 +0000 (08:12 -0500)]
client: stop doing unnecessary work in ll_lookup_inode
It's not clear to me why we're looking up the parent and name of the
inode in ll_lookup_inode, as we don't actually do anything with them.
Just return once we get an inode reference.
luo rixin [Tue, 29 Dec 2020 06:39:21 +0000 (14:39 +0800)]
rgw/rgw_file: Fix the return value of read() and readlink()
Fixes: https://tracker.ceph.com/issues/49189
Signed-off-by: Dai zhiwei <daizhiwei3@huawei.com>
Signed-off-by: luo rixin <luorixin@huawei.com>
(cherry picked from commit bfd83e8fa142873a0bdf09a4d1ad1b04127f5885)
rgw: read_obj_policy() consults iam_user_policies on ENOENT
when the head object doesn't exist, read_obj_policy() has to decide
whether to return ENOENT or EACCES
when there's a bucket policy, we check whether it has s3ListBucket
permissions. when there's an assumed role, we also need to check
against the role's policies in s->iam_user_policies
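The decision can be sketched as follows. This is a simplified model, not the rgw implementation: each policy is represented as a plain set of allowed action names, explicit denies are ignored, and the function name is hypothetical.

```python
import errno

LIST_BUCKET = "s3:ListBucket"


def missing_object_error(bucket_policy, iam_user_policies):
    """Pick the errno to report when the head object does not exist.

    bucket_policy: set of allowed actions, or None if there is no policy.
    iam_user_policies: iterable of action sets from an assumed role.
    """
    policies = [] if bucket_policy is None else [bucket_policy]
    policies.extend(iam_user_policies)
    # A requester who may list the bucket can learn that the key is absent.
    if any(LIST_BUCKET in policy for policy in policies):
        return errno.ENOENT
    # Otherwise do not reveal whether the object exists.
    return errno.EACCES
```

In this model a role policy granting `s3:ListBucket` is enough to yield ENOENT even when the bucket policy itself does not grant it, which is the case the commit adds.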
J. Eric Ivancich [Fri, 30 Apr 2021 20:07:54 +0000 (16:07 -0400)]
rgw: fix bucket object listing when marker matches prefix
When an initial marker that ends with a delimiter is provided, it
prevents listing of that "subdirectory" due to new logic at the cls
level to make listing more efficient. The fix catches that situation.
mgr/dashboard: allow getting fresh inventory data from the orchestrator
When there is a device change, a `ceph orch device ls --refresh` command
needs to be called so the orchestrator can invalidate its cache and
refresh all devices on all nodes. Currently, the call is asynchronous and
there is no way to determine whether a refresh is done or not.
To allow doing a refresh in the Dashboard:
- The inventory device list is periodically updated with cached data.
- If the user clicks the refresh button, a refresh call is sent to the
orchestrator. Thus if there are device changes, they will be revealed soon
by the periodic update.