Tiago Melo [Mon, 4 Nov 2019 16:18:27 +0000 (15:18 -0100)]
mgr/dashboard: Display the aggregated number of requests
convertTimeSeries will now calculate the aggregated total number of client
requests made in the last seconds, instead of the number of requests per second.
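The difference between the two calculations can be sketched as follows. This is
a minimal Python illustration, not the dashboard's TypeScript code: the sample
layout (timestamp, cumulative counter) and both function names are assumptions
made for the example.

```python
from typing import List, Tuple

# Assumed sample layout: (timestamp in seconds, cumulative request counter).
Sample = Tuple[float, int]


def per_second_rates(samples: List[Sample]) -> List[Tuple[float, float]]:
    """Old-style value: requests per second between consecutive samples."""
    return [
        (t1, (v1 - v0) / (t1 - t0))
        for (t0, v0), (t1, v1) in zip(samples, samples[1:])
    ]


def aggregated_totals(samples: List[Sample]) -> List[Tuple[float, int]]:
    """New-style value: total requests made between consecutive samples."""
    return [
        (t1, v1 - v0)
        for (t0, v0), (t1, v1) in zip(samples, samples[1:])
    ]


samples = [(0.0, 100), (5.0, 150), (10.0, 170)]
rates = per_second_rates(samples)
totals = aggregated_totals(samples)
```

With samples taken every 5 seconds, the first interval yields a rate of 10
requests per second but an aggregated total of 50 requests.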
The alert was triggered when less than 90% of OSDs were _up_, but then the
description took that value and described it as the percentage of OSDs being
_down_. So with 12% of OSDs down, the alert description would read:
```
88% or 88 of 100 OSDs are down (>=10%).
```
which can be panic-inducing.
This commit changes the alert expression to actually compute the ratio of OSDs
being down, which makes the correct value appear in the description.
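The mismatch can be reproduced with plain arithmetic. The sketch below is only
an illustration in Python: the helper names are hypothetical, and the message
template is copied from the example above rather than from the actual alert
rule.

```python
def percent_up(up: int, total: int) -> float:
    """Value the old expression produced: share of OSDs that are up."""
    return 100.0 * up / total


def percent_down(up: int, total: int) -> float:
    """Value the fixed expression produces: share of OSDs that are down."""
    return 100.0 * (total - up) / total


def describe(value_percent: float, total: int) -> str:
    """Render the description, which always talks about OSDs being down."""
    count = round(total * value_percent / 100)
    return f"{value_percent:.0f}% or {count} of {total} OSDs are down (>=10%)."


# 100 OSDs, 88 up, 12 down.
old_text = describe(percent_up(88, 100), 100)
new_text = describe(percent_down(88, 100), 100)
```

Here `old_text` is the misleading "88% ... are down" message, while `new_text`
reports 12%.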
Jason Dillaman [Mon, 11 May 2020 23:55:50 +0000 (19:55 -0400)]
pybind/rbd: RBD.create() method's 'old_format' parameter now defaults to False
The RBD v1 format has been deprecated for numerous releases and creation of
v1 format images has been disabled since the Mimic release. This fixes
the Python API's image create() method to ensure v2 images are created by
default (and no longer throw an exception that creation of v1 images is
disabled).
Fixes: https://tracker.ceph.com/issues/45504
Signed-off-by: Jason Dillaman <dillaman@redhat.com>
(cherry picked from commit 45349355f4b41c6f9de594ef34a8657230113e6b)
Jason Dillaman [Tue, 12 May 2020 14:16:36 +0000 (10:16 -0400)]
librbd: copy API should not inherit v1 image format by default
When copying from a v1 image, by default the new destination image
would be created using the v1 format. Since the creation of v1 images
is disallowed, this has been updated to default to using the v2
image format.
Fixes: https://tracker.ceph.com/issues/45518
Signed-off-by: Jason Dillaman <dillaman@redhat.com>
(cherry picked from commit 69b6d5997e8c6a11739e4d5a32564e61eb2f470f)
Conflicts:
src/librbd/internal.cc
- in nautilus, the call immediately preceding the change is
"src->snap_lock.put_read()" whereas in master it is
"src->image_lock.unlock_shared()"
bash_completion: Do not auto complete obsolete and hidden cmds
This patch fixes two things.
1. Do not auto complete obsolete and hidden cmds.
2. Subcommand completions often failed due to the
use of associative arrays, which do not preserve
order. Hence non-associative arrays are now used.
Igor Fedotov [Mon, 3 Feb 2020 15:50:50 +0000 (18:50 +0300)]
os/bluestore: do not use 'unused' bitmap if makes no sense.
The processing logic which relies on 'unused' bitmap makes sense for
bluestore setup where min alloc size is different from device block
size. This logic is now skipped when that is not the case.
Igor Fedotov [Mon, 3 Feb 2020 15:36:21 +0000 (18:36 +0300)]
os/bluestore: fix unused 'tail' calculation.
Fixes: https://tracker.ceph.com/issues/41901
Signed-off-by: Igor Fedotov <ifedotov@suse.com>
(cherry picked from commit c91cc3a8d689995e8554c41c9b0f652d9a3458da)
Conflicts:
src/test/objectstore/store_test.cc
- omitted test case "TEST_P(StoreTestSpecificAUSize, ReproBug41901Test)"
from the backport, because nautilus does not have the
"bluestore_debug_enforce_settings" option
Patrick Donnelly [Mon, 20 Jan 2020 19:23:09 +0000 (11:23 -0800)]
qa: log warning on scrub error
Instead of printing the (useless) traceback, just print a warning about
ignoring the failure. The traceback makes it harder to search for the
real problem in the teuthology log.
Fixes: https://tracker.ceph.com/issues/43718
Signed-off-by: Patrick Donnelly <pdonnell@redhat.com>
(cherry picked from commit b7454e423620e829e7417cdfca1faf5cd91dec3f)
Conflicts:
qa/tasks/mon_thrash.py
- whereas master has "self.manager.raw_cluster_cmd('mon', 'scrub')" in
the try block, in nautilus it is only "self.manager.raw_cluster_cmd('scrub')"
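The pattern described above (warn and move on instead of dumping a traceback)
can be sketched like this. It is a minimal illustration, not the code in
qa/tasks/mon_thrash.py: the function name and the `run_scrub` callable are
hypothetical stand-ins for the cluster command.

```python
import logging
from typing import Callable

log = logging.getLogger(__name__)


def scrub_ignoring_errors(run_scrub: Callable[[], None]) -> bool:
    """Run a scrub; on failure log a one-line warning instead of a traceback."""
    try:
        run_scrub()
        return True
    except Exception as e:
        # log.exception() would write the full traceback into the log;
        # log.warning() keeps the failure to a single searchable line.
        log.warning("Ignoring exception while trying to scrub: %s", e)
        return False
```

A caller can then keep going regardless of the result, and the log contains one
warning line per failed scrub.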
Sage Weil [Tue, 28 Jan 2020 19:33:49 +0000 (13:33 -0600)]
osd: dispatch_context and queue split finish on early bail-out
If we bail out of advance_pg early because there is an upcoming merge, we
still need to dispatch_context() on rctx before we drop the PG lock. And
the rctx that we submit needs to include the on_applied finisher commit
to call _finish_splits.
This is noticeable (at least) when there is a split and merge that are
both known. When we process the split, the new child is added to new_pgs.
When we get to the merge epoch, we stop early and take the bail-out
path.
Fix by adding a dispatch_context call for this path. And further make sure
that both dispatch_context callers in this function queue up the
new_pgs event.
Matthew Oliver [Wed, 26 Feb 2020 06:15:22 +0000 (06:15 +0000)]
rgw: anonymous swift to obj that doesn't exist should 401
Currently, if you attempt to GET an object in the Swift API that
doesn't exist and you don't pass an `X-Auth-Token`, it will 404 instead of
401.
This is actually a rather big problem, as it means someone can leak data
out of the cluster: not the object data itself, but whether an object exists or
not.
This is caused by the SwiftAnonymousEngine's frankly wide-open
is_applicable acceptance. When we get to checking the bucket or object
for user acceptance we deal with it properly, but if the object doesn't
exist, because the user has been "authorised", rgw returns a 404.
Why? Because we always override the user with the Swift account.
This means that, as far as checks are concerned, the auth user is the user, not
an anonymous user.
I assume this is because a swift container could have world-readable
reads or writes, and slight S3 and Swift API divergences can let these
interesting edge cases leak in.
This patch doesn't change the user to the swift account if they are
anonymous. So we can do some anonymous checks when it suits later in the
request processing path.
Fixes: https://tracker.ceph.com/issues/43617
Signed-off-by: Matthew Oliver <moliver@suse.com>
(cherry picked from commit b03d9754e113d24221f1ce0bac17556ab0017a8a)
Conflicts:
src/rgw/rgw_swift_auth.h
- where master has "rgw_user(s->account_name)", nautilus has
"s->account_name" only
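The intended outcome can be modelled as a small decision function. This is a
simplified sketch in Python, not rgw's request path: the function, its
parameters, and the treatment of world-readable containers are assumptions made
for illustration, and token validation is not modelled.

```python
from typing import Optional


def status_for_get(token: Optional[str],
                   object_exists: bool,
                   container_world_readable: bool = False) -> int:
    """Return the HTTP status a Swift GET should produce in this model."""
    anonymous = token is None
    if anonymous and not container_world_readable:
        # Answer 401 whether or not the object exists, so an anonymous
        # caller cannot probe for object existence.
        return 401
    return 200 if object_exists else 404
```

In this model an anonymous request gets the same answer for existing and
missing objects, which is what closes the information leak.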
Laura Paduano [Wed, 13 May 2020 12:16:57 +0000 (14:16 +0200)]
Merge pull request #34450 from rhcs-dashboard/wip-44980-nautilus
nautilus: monitoring: Fix pool capacity incorrect
Reviewed-by: Alfonso Martínez <almartin@redhat.com>
Reviewed-by: Patrick Seidensal <pseidensal@suse.com>
Reviewed-by: Laura Paduano <lpaduano@suse.com>
Removed because these files are not available in Nautilus:
src/pybind/mgr/dashboard/frontend/src/app/shared/services/password-policy.service.ts
src/pybind/mgr/dashboard/frontend/src/app/ceph/cluster/telemetry/telemetry.component.ts
src/pybind/mgr/dashboard/frontend/src/app/ceph/shared/smart-list/smart-list.component.ts
Sage Weil [Thu, 27 Feb 2020 15:30:27 +0000 (09:30 -0600)]
compressor/lz4: rebuild if buffer is not contiguous
In older versions of lz4 (specifically < 1.8.2) bit errors
can be introduced when compressing from fragmented memory. The lz4
bug was fixed by this lz4 commit:
The error can be reproduced using following command :
./frametest -v -i100000000 -s1659 -t31096808
It's actually a bug in the stream LZ4 API,
when starting a new stream
and providing a first chunk to complete with size < MINMATCH.
In which case, the chunk becomes a dictionary.
No hash was generated and stored,
but the chunk is accessible as default position 0 points to dictStart,
and position 0 is still within MAX_DISTANCE.
Then, next attempt to read 32-bits from position 0 fails.
The issue would have been mitigated by starting from index 64 KB,
effectively eliminating position 0 as too far away.
The proper fix is to eliminate such "dictionary" as too small.
Which is what this patch does.
This is a workaround to rebuild our input buffer into a contiguous buffer
if it is not already contiguous.
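The idea of the workaround can be sketched in Python. This is an illustration
only: the list of byte fragments stands in for a fragmented bufferlist, the
function name is hypothetical, and zlib from the standard library is swapped in
for lz4.

```python
import zlib
from typing import List


def compress_fragments(fragments: List[bytes]) -> bytes:
    """Compress input that may be split across several memory fragments."""
    if len(fragments) == 1:
        # Already contiguous: use the single fragment as is.
        contiguous = fragments[0]
    else:
        # Rebuild into one contiguous buffer before compressing, so the
        # compressor never sees a tiny first chunk followed by more data.
        contiguous = b"".join(fragments)
    return zlib.compress(contiguous)


fragments = [b"ab", b"cdefgh" * 100, b"xyz"]
compressed = compress_fragments(fragments)
```

The rebuild costs an extra copy only when the input is actually fragmented.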
Dan van der Ster [Wed, 26 Feb 2020 20:50:07 +0000 (21:50 +0100)]
test/compressor: test round trip of an osdmap
Check if the compressors can compress/decompress a bufferlist which is not word
aligned, such as a freshly-encoded osdmap.
Related-to: https://tracker.ceph.com/issues/39525
Signed-off-by: Dan van der Ster <daniel.vanderster@cern.ch>
(cherry picked from commit 1b1c71a2c28c38d3e28f006b1cb164435a653c02)
Conflicts:
qa/suites/rbd/openstack/workloads/devstack-tempest-gate.yaml
- some difference compared to master, but the entire test is being deleted so
I didn't examine it further
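A round-trip check of this kind can be sketched as below. It is not the Ceph
unit test: the standard-library compressors stand in for Ceph's compressor
plugins, and the payload is an arbitrary buffer whose length is deliberately
not a multiple of 8 rather than a real encoded osdmap.

```python
import bz2
import lzma
import zlib
from typing import Callable, Dict, Tuple

# Stand-in compressors: name -> (compress, decompress).
COMPRESSORS: Dict[str, Tuple[Callable[[bytes], bytes],
                             Callable[[bytes], bytes]]] = {
    "zlib": (zlib.compress, zlib.decompress),
    "bz2": (bz2.compress, bz2.decompress),
    "lzma": (lzma.compress, lzma.decompress),
}


def round_trip_ok(payload: bytes) -> Dict[str, bool]:
    """Compress and decompress with each compressor; report exact matches."""
    return {
        name: decompress(compress(payload)) == payload
        for name, (compress, decompress) in COMPRESSORS.items()
    }


# 1021 bytes: not a multiple of the 8-byte word size.
payload = (bytes(range(256)) * 4)[:1021]
results = round_trip_ok(payload)
```

Any compressor that corrupts the data shows up as `False` in `results`.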
Or Friedmann [Wed, 4 Sep 2019 13:34:52 +0000 (16:34 +0300)]
fix rgw lc does not delete objects that do not have exactly the same tags as the rule
It is possible for an object to have more tags than the rule that applies to it.
The object is not deleted unless its tags are exactly the same as those in the rule.
S3-tests: ceph/s3-tests#303
Fixes: https://tracker.ceph.com/issues/41652
Signed-off-by: Or Friedmann <ofriedma@redhat.com>
(cherry picked from commit ebb806ba83fa9d68f14194b1f9886f21f7195a3d)
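The matching rule that the fix aims for can be sketched as a subset check. This
is a minimal Python illustration, not the rgw lifecycle code: both function
names are hypothetical and tags are modelled as plain dictionaries.

```python
from typing import Dict


def rule_matches_exact(rule_tags: Dict[str, str],
                       object_tags: Dict[str, str]) -> bool:
    """Behaviour described as the bug: tags must be exactly the same."""
    return rule_tags == object_tags


def rule_matches_subset(rule_tags: Dict[str, str],
                        object_tags: Dict[str, str]) -> bool:
    """Intended behaviour: every rule tag is present on the object."""
    return all(
        key in object_tags and object_tags[key] == value
        for key, value in rule_tags.items()
    )


rule = {"env": "test"}
obj = {"env": "test", "owner": "alice"}
```

With these values the exact comparison rejects the object because of its extra
`owner` tag, while the subset check accepts it.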