Cory Snyder [Fri, 28 May 2021 19:08:49 +0000 (15:08 -0400)]
mgr/DaemonServer.cc: prevent integer underflow that is triggered by large increases to pg_num/pgp_num
This fixes a scenario where mgrs continually crash while attempting to apply large increases to pg_num/pgp_num. The max step size (estmax) for each incremental update to the pgp_num is calculated as a percentage of the pg_num, which permits the possibility for the max step size (estmax) to be greater than the current pgp_num when the increase is large; this causes an integer underflow when the max step size is subtracted from the pgp_num in order to calculate the next step size with std::clamp. The integer underflow causes hi < lo in args passed to std::clamp, which causes a failed assertion, SIGABRT, and ultimately crashing mgr.
Igor Fedotov [Mon, 17 May 2021 19:23:26 +0000 (22:23 +0300)]
os/bluestore: fix unexpected ENOSPC in Avl/Hybrid allocators.
Avl allocator mode was returning unexpected ENOSPC in first-fit mode if all size-
matching available extents were unaligned but applying the alignment made all of
them shorter than required. Since no lookup retry with smaller size -
ENOSPC is returned.
Additionally we should proceed with a lookup in best-fit mode even when
original size has been truncated to match the avail size.
(force_range_size_alloc==true)
Fixes: https://tracker.ceph.com/issues/50656 Signed-off-by: Igor Fedotov <ifedotov@suse.com>
(cherry picked from commit 0eed13a4969d02eeb23681519f2a23130e51ac59)
Deepika Upadhyay [Wed, 26 May 2021 19:18:38 +0000 (00:48 +0530)]
octopus: qa/upgrade: disable update_features test_notify with older client as lockowner
* with the recent support for async rbd operations from pacific+ when an
older client(non async support) goes on upgrade, and simultaneously
interacts with a newer client which expects the requests to be async,
experiences hang; considering the return code for request completion to
be acknowledgement for async request, which then keeps waiting for
another acknowledgement of request completion.
this if happens should be a rare only when lockowner is an old client
and should be deferred if compatibility issues arises.
* qa/upgrade: amend upgrade test workunits to use respective stable branches
max_misplaced with replaced by in target_max_misplaced_ratio edbd592ee44e02a5328e1510879555c2f9dcfc9e, but the document was not
sync'ed. let's update it accordingly.
In 7f047005fc72e1f37a45cde2d742bb2eb1e62881, we made the pg removal code
much more efficient. But it started marking the pgmeta object as an unexpected
onode, which in reality is expected to be removed after all the other objects.
This behavior is very easily reproducible in a vstart cluster:
ceph osd pool create test 1 1
rados -p test bench 10 write --no-cleanup
ceph osd pool delete test test --yes-i-really-really-mean-it
Before this patch:
"do_delete_work additional unexpected onode list (new onodes has appeared
since PG removal started[#2:00000000::::head#]" seen in the OSD logs.
After this patch:
"do_delete_work removing pgmeta object #2:00000000::::head#" is seen.
Related to:https://tracker.ceph.com/issues/50466 Signed-off-by: Neha Ojha <nojha@redhat.com>
(cherry picked from commit 0e917f1b1e18ca9e48b3f91110d3a46b086f7d83)
acting.size() >= pool.info.min_size is meant to check min_size against
acting set participants, but acting is a vector with placeholders.
actingset is the representation with placeholders removed.
The upshot of this bug is that the activation process will basically
ignore min_size for an ec pool allowing writes in cases where it
shouldn't. PastIntervals::check_new_interval, however, performs
the check correctly, and will therefore discount intervals in which
we really did serve writes as not writeable. This can trigger many
different problem conditions including but not limited to:
- Unfound objects due to accepting a last_update with insufficient
osds
- Lost writes
- Crashes due to peering rules being violated
This bug was originally introduced with recovery below min_size in e5a96fd, and then preserved through refactors in 749a13d and 95bec9.
7cb818a exposed it with with expansion of recovery below min_size
to include ec pools (acting.size() is sufficient for replicated
pools).
Fixes: https://tracker.ceph.com/issues/48613 Fixes: https://tracker.ceph.com/issues/48417 Signed-off-by: Samuel Just <sjust@redhat.com>
(cherry picked from commit 642a1c165499bcbd4cfdf907af313ac7ffe44ff4)
Conflicts:
src/osd/PeeringState.h
Fixes the callers rather than also backporting 95bec9873.
ignore BrokenPipeError which is thrown when piping the output of ceph
CLI to a tool which might close its stdin before ceph CLI sends the
whole help message.
Follow approach suggested by Kefu: https://github.com/python/cpython/commit/7b0ed43af55c1e2844aa0ccd5e088b2ddd38dbdb
This doesn't manage the clean-up/exit logic, as that's deferred to the
last part of the __main__ code.
Conflicts:
src/test/libcephfs/test.cc
- octopus is missing a bunch of tests, but this doesn't matter because the
commit being cherry-picked did not touch those
Jeff Layton [Mon, 1 Feb 2021 16:04:07 +0000 (11:04 -0500)]
client: add ceph_ll_lookup_vino
Add a new API function for looking up an inode via a vinodeno_t. This
should give ganesha a way to reliably look up snapshot inodes.
We do need to add some special handling for CEPH_SNAPDIRs. If we're
looking for one, then find the non-snapped parent, and then call
open_snapdir to get the snapdir inode.
Also, have the function check the local cache before calling the MDS
to look up an inode.
Fixes: https://tracker.ceph.com/issues/48991 Signed-off-by: Jeff Layton <jlayton@redhat.com>
(cherry picked from commit 70622079c2ec55222a139fa5042902e0b19bd839)
Jeff Layton [Mon, 1 Feb 2021 15:41:14 +0000 (10:41 -0500)]
client: make _lookup_ino take a vinodeno_t
Currently, it always leaves the snapid as 0. Rename it to
_lookup_vino and make it fill the snapid from the vinodeno_t
instead, but only when it's a "real" snapid.
Change the existing callers to pass in a vinodeno_t with the
snapid set to CEPH_NOSNAP.
Jeff Layton [Wed, 3 Feb 2021 13:12:41 +0000 (08:12 -0500)]
client: stop doing unnecessary work in ll_lookup_inode
It's not clear to me why we're looking up the parent and name of the
inode in ll_lookup_inode, as we don't actually do anything with them.
Just return once we get an inode reference.
luo rixin [Tue, 29 Dec 2020 06:39:21 +0000 (14:39 +0800)]
rgw/rgw_file: Fix the return value of read() and readlink()
Fixes: https://tracker.ceph.com/issues/49189 Signed-off-by: Dai zhiwei <daizhiwei3@huawei.com> Signed-off-by: luo rixin <luorixin@huawei.com>
(cherry picked from commit bfd83e8fa142873a0bdf09a4d1ad1b04127f5885)
rgw: read_obj_policy() consults iam_user_policies on ENOENT
when the head object doesn't exist, read_obj_policy() has to decide
whether to return ENOENT or EACCES
when there's a bucket policy, we check whether it has s3ListBucket
permissions. when there's an assumed role, we also need to check
against the role's policies in s->iam_user_policies
J. Eric Ivancich [Fri, 30 Apr 2021 20:07:54 +0000 (16:07 -0400)]
rgw: fix bucket object listing when marker matches prefix
When an iniitial marker that ends with a delimiter is provided, it
prevents listing of that "subdirectory" due to new logic at the cls
level to make listing more efficient. The fix catches that situation.
mgr/dashboard: allow getting fresh inventory data from the orchestrator
When there is a device change, a `ceph orch device ls --refresh` command
needs to be called so the orchestrator can invalidate its cache and
refresh all devices on all nodes. Currently, the call is asynchronous and
there is no way to determine is a refresh is done or not.
To allow doing a refresh in the Dashboard:
- The inventory device list is periodically updated with cached data.
- If the user clicks the refresh button, a refresh call is sent to the
orchestrator. Thus if there are device changes, it will be revealed soon
because of the periodical update.
Mykola Golub [Tue, 11 May 2021 06:53:08 +0000 (07:53 +0100)]
osd: don't assert in-flight backfill is always in recovery list
In PrimaryLogPG::on_failed_pull, we unconditionally remove soid
from recovering list, but remove it from backfills_in_flight only
when the backfill source is the primary osd.
Kefu Chai [Fri, 16 Oct 2020 17:10:24 +0000 (01:10 +0800)]
pybind/mgr/dashboard: use setUpClass for initializeing class
instead of relying on __init__(), use setUpClass() to initialize class
for testing. it turns out in pytest > 4, __init__() is called for the
test class but the attributes of the instantiated class is in turn overriden.
Kefu Chai [Thu, 8 Oct 2020 07:13:36 +0000 (15:13 +0800)]
tools/setup-virtualenv.sh: pass --use-feature=2020-resolver to pip
as long as pip supports this option, pass it to `pip install`
to silence warnings and errors like:
ERROR: After October 2020 you may experience errors when installing or updating packages. This is because pip will change the way that it resolves dependency conflicts.
We recommend you use --use-feature=2020-resolver to test your packages with the new resolver before it becomes the default.
autopep8 1.5.4 requires pycodestyle>=2.6.0, but you'll have pycodestyle 2.5.0 which is incompatible.
pytest-cov 2.10.1 requires pytest>=4.6, but you'll have pytest 3.10.1 which is incompatible.
pybind/mgr/dashboard: move pytest into requirements.txt
before this change, pytest is included by both requirements-lint.txt
and requirements-test.txt. this fails the install-deps.sh script when
collecting the python package wheels:
Dan van der Ster [Thu, 29 Apr 2021 23:06:17 +0000 (01:06 +0200)]
mgr/progress: ensure progress stays between [0,1]
If _original_pg_count is 0 then progress can be negative.
Fixes: https://tracker.ceph.com/issues/50591 Related-to: https://tracker.ceph.com/issues/50587 Signed-off-by: Dan van der Ster <daniel.vanderster@cern.ch>
(cherry picked from commit 20990a94598d0249745e2ec25c9197d842119d92)