for slo, we've already calculated this as 'lo_etag' in get_params()
for dlo, the local 'etag' already contains the hash of an empty string
the calls to complete_etag() were calling hash.Final() a second time on
the same hash without a hash.Restart() in between. this apparently
worked fine with NSS, but with OpenSSL the second call to Final()
returns a different value
rgw: orphans find: don't process stale bucket instances
A large bucket may have been resharded multiple times, so check the current
bucket info and ensure that no reshard is in progress before we attempt to log
bucket index entries. This avoids wasteful processing of stale bucket instance
entries.
rgw: orphan: introduce a detailed mode (off by default)
We currently stat and log objects that fit in a head as well. Since head objects
are skipped anyway in the rados list output, this commit avoids logging these
objects if the object size itself is less than the manifest head size.
Additionally, we avoid the stat call itself for entries in the list object
output when the object fits within the chunk size. This behaviour can be
disabled by setting the detailed mode, which can help in older clusters where
the head used to have a different size.
The old behaviour in both cases can be turned on by setting the detailed flag,
which can be passed on from rgw-admin. Avoiding stat calls and not logging the
head objects significantly reduces the IO activity on clusters which have a
huge percentage of objects that fit in a head.
rgw: orphans tool: align with rgw list bucket min readahead
At the rgw::rados layer we read up to `min readahead` entries anyway and then
pass on only the requested amount to the caller. Since this translates down to a
cls call requesting 1000 omap keys by default, it makes sense not to waste the
entries, and to process them.
A `radosgw-admin lc fix --bucket <>` option is added which checks if the bucket
entry exists in the corresponding lc shard and creates it if not. In the case of
resharded buckets not running a fixed rgw that writes/compares the marker, this
would write a new entry with the marker, as the old entry would've already been
deleted by an LC process. We currently don't clean up the stale entry, as it is
assumed this would be picked up by the LC processor already or would be picked
up in the next cycle.
Since buckets can undergo resharding which changes the bucket id, using the
bucket marker in the shard id can help prevent the need to rewrite the entry as
the buckets get resharded. This also helps detect the exit criteria when the
bucket gets deleted.
Matt Benjamin [Fri, 8 Mar 2019 20:41:05 +0000 (15:41 -0500)]
rgw: prefix-delimiter listing: support >1 character delimiter
Fix prefix and CommonPrefix extraction logic in
RGWRados::Bucket::List::list_objects_ordered so as to permit
arbitrary-length string delimiters.
Fixes: https://tracker.ceph.com/issues/24821
Signed-off-by: Matt Benjamin <mbenjamin@redhat.com>
(cherry picked from commit e3c1ea244234aace7368d5a5ee95af2f6a529b00)
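The effect of supporting arbitrary-length delimiters can be illustrated with a small Python sketch. This is not the RGWRados::Bucket::List::list_objects_ordered code; the function name `list_with_delimiter` and its arguments are hypothetical, and paging/markers are ignored. It only shows the CommonPrefix extraction rule: search for the delimiter after the prefix, and keep the whole delimiter in the rolled-up prefix.

```python
def list_with_delimiter(keys, prefix="", delimiter="/"):
    """Split keys into (objects, common_prefixes) for one listing pass.

    Illustrative only: the delimiter may be any non-empty string,
    not just a single character.
    """
    objects = []
    common_prefixes = []
    for key in sorted(keys):
        if not key.startswith(prefix):
            continue
        # Search for the delimiter only in the part that follows the prefix.
        pos = key.find(delimiter, len(prefix)) if delimiter else -1
        if pos < 0:
            objects.append(key)
            continue
        # Keep the full delimiter (len(delimiter) characters, not 1).
        cp = key[:pos + len(delimiter)]
        # Keys sharing a common prefix are adjacent in sorted order,
        # so comparing with the last entry is enough to deduplicate.
        if not common_prefixes or common_prefixes[-1] != cp:
            common_prefixes.append(cp)
    return objects, common_prefixes
```

For example, with prefix `a--` and delimiter `--`, the keys `a--b--c` and `a--b--d` roll up into the single common prefix `a--b--`, while `a--e` is returned as an object.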
Sage Weil [Tue, 9 Apr 2019 22:12:37 +0000 (17:12 -0500)]
mgr/DaemonServer: prevent pgp_num reductions from outpacing pg_num merges
If we are merging lots of pgs down to a much smaller number of pgs, and
the pgs are able to move quickly (faster than the merges happen), we can
end up with too many pgs on a small number of osds, triggering the max
pgs per osd limits.
Avoid this by preventing the pgp_num reductions from getting too far
out in front of the merges themselves. Basically, cap the delta between
pgp_num and pg_num to the max_misplaced ratio. We are already limiting
the movement caused by pgp_num by max_misplaced; this effectively just
makes sure that the actual merging (and pg_num reductions) are keeping
up.
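A rough sketch of the capping rule described above, written in Python rather than the mgr's C++. The function name, its parameters and the `min_gap` floor are illustrative assumptions, not the actual DaemonServer code, and the separate max_misplaced limit on data movement is left out.

```python
def next_pgp_num(pg_num, pgp_num, pgp_num_target, max_misplaced, min_gap=1):
    """Pick the next pgp_num while reducing toward pgp_num_target.

    pgp_num is not allowed to drop more than (pg_num * max_misplaced)
    below the current pg_num, so merges get a chance to catch up.
    """
    if pgp_num_target >= pgp_num:
        # Not a reduction; the cap in this sketch only covers merges.
        return pgp_num_target
    # Largest allowed distance between pg_num and pgp_num (assumed rounding).
    max_gap = max(min_gap, int(pg_num * max_misplaced))
    floor = pg_num - max_gap
    # Never move pgp_num back up, and never go below the final target.
    return max(pgp_num_target, min(pgp_num, floor))
```

With pg_num = pgp_num = 1024, a target of 128 and max_misplaced = 0.05, this sketch lets pgp_num drop only to 973; further reductions wait until pg_num itself has come down.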
cephfs-shell: Change type of d_name to bytes array
After reverting commit 5106582, 'd_name' is always a bytes array. This produces
a type error wherever 'd_name' is used with the 'str' type. In such cases, decode it.
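A minimal sketch of the kind of decoding this implies. The helper name `to_str` and the UTF-8 default are assumptions for illustration; they are not taken from cephfs-shell itself.

```python
def to_str(name, encoding="utf-8"):
    """Return name as str, decoding it first if it is a bytes object."""
    if isinstance(name, (bytes, bytearray)):
        return name.decode(encoding)
    return name


# Stand-in for a directory entry name as returned in bytes form.
d_name = b"dir1"

# "/" + d_name would raise TypeError (str + bytes); decode first.
path = "/" + to_str(d_name)
```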
qa/tasks/ceph_deploy: install python3.6 instead of python3.4 for py3 tests
EPEL7 has switched over to python3.6 as the main python3, and we have been
packaging python bindings for python3.6 since
https://github.com/ceph/ceph-build/pull/1283
rpm: add "Provides: python3-*" for python packages
so user can install python3-rados, instead of python36-rados, without
specifying the minor version of python. also, we should not break our
teuthology tests with this naming scheme change. for instance, our
cephfs qa suite installs `python3-cephfs` for testing the `cephfs-shell`
some of our centos7 jenkins builders are failing to build ceph master and
nautilus branches, because EPEL7 recently switched from python3.4 to
python3.6 as the native python3. see
https://lists.fedoraproject.org/archives/list/epel-announce@lists.fedoraproject.org/message/EGUMKAIMPK2UD5VSHXM53BH2MBDGDWMO/
and one of our BuildRequires, cmake3,
was offered by EPEL7. it also followed the python3.6 switch-over and was
rebuilt against python3.6. as a result, cmake3-data-3.13.4-2.el7
started to depend on /usr/bin/python3.6, which is in turn offered by
the python36 package, so python36 gets installed as a dependency of the
updated cmake3. but in cmake, we originally check for the latest
python3 interpreter if WITH_PYTHON3 is enabled. that's why these
builders, which happen to install these updated packages, started to fail
when detecting the existence of python3.6 related build dependencies.
as a fix, in d1e83082,
python%{python3_pkgversion}-{devel,setuptools,Cython} are listed as
BuildRequires to reflect this change in EPEL7. before d1e83082, we
hardwired them to python34-*.
but as the following analysis shows, there are cases where `yum-builddep`
is inconsistent with `rpmbuild`, as `yum-builddep` changes how the
`python3_pkgversion` and `python3_version` macros are expanded:
- none of the packages installed by `yum-builddep` install the python3
related rpm macros, so the system stays with whatever python3 it was
using. in this case, `rpmbuild` won't complain, as
`python3_pkgversion` and `python3_version` are consistent before and
after `yum-builddep`.
- system has python3.4 installed before `yum-builddep`. but
`yum-builddep` installed python3.6 and also the updated
`python-rpm-macros` packages, which points `python3_version` and
`python3_pkgversion` to 3.6 and 36 respectively. in this case,
`rpmbuild` will complain, because when we run `yum-builddep`,
`python3_version` was still "3.4".
- system does not have python3 installed before `yum-builddep`. so
it was using python34 for preparing the "BuildRequires". but some
of the packages installed by `yum-builddep` installs python36, and
also the updated `python-rpm-macros` packages, which points
`python3_version` and `python3_pkgversion` to 3.6 and 36 respectively.
in this case, `rpmbuild` will complain, because the python36 related
dependencies are missing. what the system has is python34
dependencies.
- system does not have python3 installed before `yum-builddep`. so
it was using python34 for preparing the "BuildRequires". but some
of the packages installed by `yum-builddep` installs python34, and
also the updated `python-rpm-macros` packages, which points
`python3_version` and `python3_pkgversion` to 3.4 and 34 respectively.
in this case, `rpmbuild` won't complain, as
`python3_pkgversion` and `python3_version` are also consistent before and
after `yum-builddep`.
as we cannot tell whether the system has python3, or which python3 version
it has, before `yum-builddep`, what we can do is to ensure
`rpmbuild` has what it needs to build Ceph. so let's just stick with
python3.6.
Boris Ranto [Thu, 4 Apr 2019 20:00:55 +0000 (22:00 +0200)]
cmake: check for MAJOR.MINOR version of python3
We can only check for MAJOR.MINOR version of python3 since
FindPython3Libs does not support checking for MAJOR.MINOR.PATCH version
of python3. We also need to make sure we use the PYTHON3 versions of
these variables.
This should fix a regression introduced by c961e00.
lishuhao [Mon, 28 Jan 2019 08:05:52 +0000 (16:05 +0800)]
rgw: make rgw admin ops api get user info consistent with the command line
GET /{admin}/user?format=json HTTP/1.1
Host: {fqdn}
The information returned by this API is incomplete relative to `radosgw-admin user info --uid xxxx`.
This modification changes the information returned by three calls:
RGWUserAdminOp_User::info
RGWUserAdminOp_User::create
RGWUserAdminOp_User::modify
Sage Weil [Wed, 10 Apr 2019 21:49:17 +0000 (16:49 -0500)]
Merge PR #27387 into nautilus
* refs/pull/27387/head:
mgr/pg_autoscaler: apply bias to pg_num selection
mgr/pg_autoscaler: include pg_autoscale_bias in autoscale-status table
osd/osd_types,mon: add pg_autoscale_bias pool property
Sage Weil [Wed, 10 Apr 2019 21:48:16 +0000 (16:48 -0500)]
Merge PR #27440 into nautilus
* refs/pull/27440/head:
os/filestore/FileJournal: note EIO events
os/filestore: make note of EIO errors when we see them
os/filestore: note devname for later use
global/signal_handler: avoid core dump on EIO
os/bluestore/KernelDevice: note EIO metadata on aio EIO
global: add hook to annotate crash report with EIO information
After PR https://github.com/ceph/ceph/pull/26572, when RGW is not
configured, accessing /rgw drop-down (daemons, users or buckets)
results in nothing apparently happening (not even an error).
Under the hood, what is happening is that the ModuleStatusGuard
has redirected the route to rgw/501, but as this route is now
under the parent rgw route handler, which sets CanActivateChild guards,
this results in a new ModuleStatusGuard invocation, a subsequent
failure and a new redirection to rgw/501.
Several approaches could be taken here:
- Remove error pages from lazy-loaded modules. Probably it does not
make sense to have a 501 page per component.
- Add some whitelist to avoid this kind of loop (e.g.: 501, or any
error page).
- Set a max number of redirections (cautionary measure).
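A minimal sketch of how the whitelist and max-redirection ideas listed above could be combined, written in Python for brevity rather than as Angular guard code. `ERROR_ROUTES`, `MAX_REDIRECTS` and `resolve()` are hypothetical names, and the routing model is deliberately simplified.

```python
ERROR_ROUTES = {"rgw/501"}  # hypothetical whitelist of error pages
MAX_REDIRECTS = 5           # hypothetical cautionary cap


def resolve(route, module_available, redirects=0):
    """Return the route that finally gets activated."""
    if route in ERROR_ROUTES:
        # Whitelisted: error pages are never sent through the guard again.
        return route
    if redirects >= MAX_REDIRECTS:
        raise RuntimeError("too many redirections for route: " + route)
    if route.startswith("rgw") and not module_available:
        # The guard fails, so redirect to the error page exactly once.
        return resolve("rgw/501", module_available, redirects + 1)
    return route
```

Without the whitelist check, the guard would run again for `rgw/501` itself and keep redirecting, which is the loop described above; the redirect cap only limits the damage if such a loop slips through.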
Ernesto Puerta [Tue, 26 Mar 2019 18:01:01 +0000 (19:01 +0100)]
mgr/dashboard: unify button/URL actions naming
- Mappings (actually an Enum) created for actions (buttons and other UI elements) and URLs: ActionLabels and URLVerbs.
- An alternative would be to fix/improve the current i18n-polyfill, which only works with literal strings (not even with 'const enums', which become literals after TypeScript transpiling).
- Additionally having a predefined file with some strings to translate (actions, verbs, etc) could improve on the 1st of the 2-stage i18n process (as extraction tool has a lot of limitations).
- A corresponding ActionLabelsI18n service with translated labels (it's a service as I haven't found a way to either translate non-const strings (ngx-translate/AST parser failure) or get a static translator).
- This service could/should be extended to cover all strings that are defined in static/globally scoped objects before any I18n provider has been initialized.
- Breadcrumbs are not translated (neither were they before this change). This part remains untackled: using 'proxy' static objects and performing live translation could deal with the issue.
- New URLBuilder service created (following an established pattern in the Java/.NET world). This should avoid the need to mess with literal URLs and string composition/parsing, and while the front-end is not meant to be consumed by anyone, Angular does not provide any other way for the app to navigate between components, so the URLs are a de-facto interface contract. Although this approach is not flawless, it's easier to enforce, while issues coming from free-form strings are really hard to catch.
- This could be further improved by using a router registry/dynamic routing. Most of the routes are trivial.
- As a side effect of these changes, the routing module has been refactored and some routes moved to their specific modules (pool, rbd, rgw), via loadChildren and routes.forChild() magic. Now the above-mentioned components are lazy-loaded/pre-loaded (meaning right after the main code is loaded). This should also decrease the loading time (though this is probably not the biggest time eater here).
- As now modules can be loaded multiple times, not only from App module by means of lazy loading, but also from other ones (as PoolModule loads BlockModule to get QoS widgets in Pool windows), now lazy loaded modules include 2 NgModules (one with imports: RouterModule.forChild(routes), meant for lazy-loading, and another without routes).
- Caveat: Some parts might not be (fully) translated (NFS, iSCSI, mirroring), as there's been ongoing work on them and it's hard to keep up with the new code.
These changes will be a waste of time if new code does not benefit from/adhere to them, so I'm still figuring out how to spread this (nothing really fancy to demo). Maybe adding some checks/harnessing to enforce the new naming convention (ideas greatly welcome here).
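The URL-builder idea from the list above can be sketched as follows. This is a Python illustration only; the dashboard's service is Angular/TypeScript, and the enum members, method name and URL layout shown here are assumptions, not the actual URLVerbs/URLBuilder definitions.

```python
from enum import Enum
from urllib.parse import quote


class URLVerbs(Enum):
    # Hypothetical subset of verbs, for illustration only.
    CREATE = "create"
    EDIT = "edit"


class URLBuilder:
    """Compose URLs from a base path, a verb and arguments."""

    def __init__(self, base):
        self.base = base

    def get_url(self, verb, *args, absolute=True):
        # Percent-encode each argument so callers never hand-build strings.
        parts = [self.base, verb.value] + [quote(str(a), safe="") for a in args]
        url = "/".join(parts)
        return "/" + url if absolute else url


pool_urls = URLBuilder("pool")
edit_url = pool_urls.get_url(URLVerbs.EDIT, "my pool")
```

Keeping the verbs in one enum means a misspelled verb fails at the attribute lookup instead of silently producing a broken URL, which is the enforcement benefit the list item argues for.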