Conflicts:
src/pybind/cephfs/cephfs.pyx
- certain exceptions were missing in nautilus
src/test/pybind/test_cephfs.py
- some cephfs pybind interfaces and their tests were missing in nautilus
rgw: orphan list teuthology test & fully-qualified domain issue
Sometimes when teuthology machines are provisioned, the command
`hostname --fqdn` does not provide a fully qualified domain name but
instead just the hostname (e.g., smithi149 instead of
smithi149.front.sepia.ceph.com). This prevents the teuthology test for
rgw-orphan-list from running successfully [for example, the hostname
was for some reason mis-interpreted as the bucket name in the
request].
This commit checks whether the hostname derived from `hostname --fqdn`
contains any '.'s and if it does not, it will append
".front.sepia.ceph.com" to the hostname. This is a hack, but until
teuthology machines are configured appropriately it seems to be a
reasonable work-around.
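A minimal Python sketch of the check described above. The helper name `ensure_fqdn` is hypothetical and the real test may be written in shell; only the suffix and the "contains a '.'" rule come from the commit message.

```python
# Suffix named in the commit message.
SEPIA_SUFFIX = ".front.sepia.ceph.com"


def ensure_fqdn(hostname: str, suffix: str = SEPIA_SUFFIX) -> str:
    """Return hostname unchanged if it already contains a '.',
    otherwise append the lab domain suffix (hypothetical helper)."""
    hostname = hostname.strip()
    if "." in hostname:
        return hostname
    return hostname + suffix
```

For example, `ensure_fqdn("smithi149")` yields `smithi149.front.sepia.ceph.com`, while an already qualified name is returned as-is.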
rgw: rgw-orphan-list -- fix interaction, quoting, and percentage calc
The interactive mode wasn't working because prompts went to stdout
instead of stderr. A space in a temporary file name would
generate a shell error, so quoting was added. Furthermore, if no
objects are found in a pool, a divide-by-zero error is
generated. This commit addresses these issues.
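The divide-by-zero guard can be illustrated as follows. This is a Python sketch of the idea only: the actual tool is a shell script, and the function name and the choice of reporting 0.0 for an empty pool are assumptions.

```python
def orphan_percentage(orphan_count: int, total_count: int) -> float:
    """Percentage of orphans among all objects in a pool.

    Assumption: an empty pool is reported as 0.0 rather than
    raising ZeroDivisionError.
    """
    if total_count == 0:
        return 0.0
    return 100.0 * orphan_count / total_count
```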
J. Eric Ivancich [Tue, 23 Jun 2020 21:05:59 +0000 (17:05 -0400)]
rgw: orphan-list timestamp fix
When creating intermediate and output files, the rgw-orphan-list
script builds a timestamp with the `date` command. The hour was
inserted with "%k", but that pads with a space rather than a zero, so
it is changed to "%H".
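The difference between space padding and zero padding can be seen in a short Python sketch. The exact format string used by the script is an assumption here; Python's `%H` is zero-padded like the `date` directive of the same name.

```python
from datetime import datetime


def timestamp_suffix(now: datetime) -> str:
    """Build a file-name-safe timestamp; %H zero-pads the hour."""
    # Assumed layout: year, month, day, hour, minute, second.
    return now.strftime("%Y%m%d%H%M%S")


def space_padded_hour(now: datetime) -> str:
    """Mimic what a space-padded hour looks like (the old behaviour)."""
    return f"{now.hour:2d}"
```

With an hour of 5, the space-padded form is `" 5"`, which introduces a space into the file name, while `%H` gives `"05"`.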
qa/rgw: integration test for `rgw-orphan-list` & `radosgw-admin radoslist`
Add a teuthology test for `rgw-orphan-list` in a new tool suite under
rgw. It only needs to be tested under one configuration, and the new
tool sub-suite can be used by other tooling in the
future. radosgw-admin `radoslist` is tested indirectly through
`rgw-orphan-list` and therefore does not need its own test.
J. Eric Ivancich [Tue, 21 Apr 2020 15:28:58 +0000 (15:28 +0000)]
qa/rgw: allow the rgw teuthology task to capture/set dns names
A teuthology workunit might want to use the rgw task, setting the
rgw-dns-name and/or rgw-dns-s3website-name configuration options to
the fully-qualified domain name. Existing code implies that setting
these configuration options to the empty string will do that. However
the current logic does not support that given it has Python
conditionals that treat the empty string as false. This fixes that.
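The pitfall is that a check such as `if value:` cannot tell an empty string from a missing option. A sketch of the distinction, with a hypothetical helper name and config layout (not the task's actual code):

```python
from typing import Optional


def resolve_dns_name(config: dict, key: str, fqdn: str) -> Optional[str]:
    """Return the DNS name to configure, or None if the option is unset.

    An empty string means "use the fully-qualified domain name".
    """
    if key not in config:
        return None          # option not given: leave it unset
    value = config[key]
    if value == "":
        return fqdn          # empty string: substitute the FQDN
    return value             # explicit name: use it verbatim
```

A truthiness test would collapse the first two cases into one, which is the behaviour this commit fixes.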
Now the following teuthology tasks YAML will work as expected:
Adds a radosgw-admin subcommand and possibly a bucket name (if not,
all buckets are assumed) and walks the bucket(s) listing(s) and the
manifest (if it exists) for each object in the listing to generate the
rados objects that represent the rgw objects in the bucket.
Also adds a tool named rgw-orphan-list that will produce a list in a
local file of what appear to be rgw orphans.
NOTE: This is not a cherry-pick from master because the feature was
originally written for an older version of Ceph and is being gradually
forward-ported to newer and newer versions.
Signed-off-by: J. Eric Ivancich <ivancich@redhat.com>
Ernesto Puerta [Mon, 11 May 2020 18:33:25 +0000 (20:33 +0200)]
mgr/dashboard: work with RBD images v1
Add support for RBD Image Format v1:
- This format lacks ID field, required for dashboard. Instead,
RBD image `block_name_prefix` is used as unique ID (together with pool
id and namespace)
- Additionally, `image_format` is now exposed.
- In the front-end side:
- Copy action on a v1 image will cause the image to be copied to v2
format.
- List doesn't allow Move to Trash on v1 images,
- Details section now shows `image_format` for images,
- Edit Form disables flags not supported for v1 (`deep-flatten`,
`layering`, `exclusive-lock`).
- Protect does not work on v1 images or v2 images created from v1
ones.
Avan Thakkar [Tue, 25 Feb 2020 08:49:10 +0000 (14:19 +0530)]
mgr/dashboard: add popover list of managers in landing page
Fixes: https://tracker.ceph.com/issues/42979
Signed-off-by: Avan Thakkar <athakkar@redhat.com>
(cherry picked from commit cdfeb1d196c7d47340baae2be5910b90c889e778)
Conflicts:
src/pybind/mgr/dashboard/controllers/health.py
- removed a few lines, as those lines were removed in the master branch too
src/pybind/mgr/dashboard/frontend/src/app/ceph/dashboard/health/health.component.html
- added the missing braces
there is a chance that a PG has just been created but fails to peer
before a mgr module retrieves the health report from mgr. in that
case, the "last_peered" field is not set, as that PG has not peered. but
normally, the newly created PG will be active+clean in a couple of seconds,
which is way under the default setting of mon_pg_stuck_threshold (60
seconds).
so in this change, if the "last_whatever" is not set, we also use
"last_changed" as a reference to see if the PG is healthy, and only
consider the PG stuck if last_changed is also too old.
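The fallback can be sketched in Python as below. The dictionary layout, the use of plain seconds for timestamps, and the function name are assumptions for illustration; only the 60-second threshold and the fallback to "last_changed" come from the commit message.

```python
def is_pg_stuck(pg_stat: dict, now: float,
                field: str = "last_peered",
                threshold: float = 60.0) -> bool:
    """Decide whether a PG is stuck, falling back to last_changed.

    pg_stat maps field names to timestamps in seconds (assumed layout).
    """
    stamp = pg_stat.get(field)
    if stamp is None:
        # The PG never reached this state; judge by last_changed instead.
        stamp = pg_stat["last_changed"]
    return now - stamp > threshold
```

A PG created five seconds ago with no "last_peered" value is therefore not reported as stuck, while one whose "last_changed" is older than the threshold is.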
Improve cache by running requests in a thread and prevent multiple
requests to Ceph from multiple sources (e.g. Prometheus instances) which
increase load on the manager.
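One way to picture the approach is a cache that a single background thread refreshes while every scrape only reads the cached value. This is a generic sketch, not the prometheus module's code; the class and method names are hypothetical.

```python
import threading


class MetricsCache:
    """Cache filled by one collector thread and read by many scrapers."""

    def __init__(self, collect):
        self._collect = collect          # expensive call into the cluster
        self._lock = threading.Lock()
        self._data = None

    def refresh(self):
        data = self._collect()           # run outside the lock
        with self._lock:
            self._data = data

    def get(self):
        with self._lock:
            return self._data            # scrapes never trigger a collect

    def start(self, interval: float, stop_event: threading.Event):
        """Refresh every `interval` seconds until stop_event is set."""
        def loop():
            while not stop_event.is_set():
                self.refresh()
                stop_event.wait(interval)
        thread = threading.Thread(target=loop, daemon=True)
        thread.start()
        return thread
```

With this shape, several Prometheus instances scraping at once cost a single collection per interval rather than one per request.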
Fixes: https://tracker.ceph.com/issues/45554
Signed-off-by: Patrick Seidensal <pseidensal@suse.com>
(cherry picked from commit fa69d8e1112d2688e2979ab10c303d1facb6bc76)
Conflicts:
src/pybind/mgr/prometheus/module.py
- line 1096 if condition block changed
- second commit "mgr/prometheus: enable mypy type checking for prometheus module"
is not required for the backport as discussed with Patrick
who is the creator of the original PR.
mgr/dashboard: Prometheus query error in the metrics of Pools, OSDs and RBD images
Fixes: https://tracker.ceph.com/issues/45068
Signed-off-by: Avan Thakkar <athakkar@redhat.com>
(cherry picked from commit 47b515c09496da8fc326300bab6618250466effe)
This commit adds the dmcrypt support in `ceph-volume raw` mode.
Note about `ceph-volume raw list` change:
Given that the `lsblk -J` (JSON output) option isn't available on all OSes, I added
the '--inverse' option to the existing command, which allows us to get the
mapper devices list in that command's output. Not listing root devices containing
partitions shouldn't have any side effects since we are in the `ceph-volume raw`
context.
example:
running `lsblk --paths --nodeps --output=NAME --noheadings` doesn't return
the mapper list because the output looks like the following:
adding `--inverse` is a trick to get around this issue. The trade-off is that
we can't list root devices if they contain at least one partition, but this
shouldn't be an issue in the `ceph-volume raw` context given that we only deal with
raw devices.
Conflicts:
- src/pybind/mgr/dashboard/frontend/src/app/ceph/block/rbd-configuration-list/rbd-configuration-list.component.html
Formatting change in the master, use the code in the master to fix.
Signed-off-by: Kiefer Chang <kiefer.chang@suse.com>
mon/OSDMonitor: Reset grace period if failure interval exceeds a threshold.
Reset the grace heartbeat period if there have been no failures since the
set threshold value (48 Hrs). The mon_osd_laggy_halflife value is
leveraged to calculate the threshold.
A couple of helper functions do the following:
- get_grace_interval_threshold():
Calculates and returns the grace interval threshold value.
- grace_interval_threshold_exceeded(int):
Checks if grace interval threshold is exceeded based on the last
down stamp.
- set_default_laggy_params(int):
Resets the laggy_probability and laggy_interval in the
new_xinfo structure maintained within pending_inc to be applied
eventually as part of update from paxos.
The threshold value is checked and the laggy parameters are reset at the
following point,
- encode_pending() - If an existing osd is experiencing failure
after an interval exceeding the failure threshold period.
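The three helpers can be modelled in Python roughly as follows. This is an illustrative sketch, not the C++ in OSDMonitor: the multiplier of 48 is an assumption chosen so that a one-hour halflife yields the 48-hour threshold mentioned above, and the dictionary stands in for the new_xinfo structure.

```python
# Assumed multiplier: 48 x a 3600 s halflife gives the 48-hour threshold.
GRACE_THRESHOLD_FACTOR = 48


def get_grace_interval_threshold(laggy_halflife_secs: int) -> int:
    """Threshold in seconds derived from mon_osd_laggy_halflife."""
    return laggy_halflife_secs * GRACE_THRESHOLD_FACTOR


def grace_interval_threshold_exceeded(secs_since_last_down: int,
                                      laggy_halflife_secs: int) -> bool:
    """True if the OSD has gone longer than the threshold without failing."""
    return secs_since_last_down > get_grace_interval_threshold(laggy_halflife_secs)


def set_default_laggy_params(xinfo: dict) -> None:
    """Reset the laggy statistics (dict stands in for new_xinfo)."""
    xinfo["laggy_probability"] = 0.0
    xinfo["laggy_interval"] = 0
```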
Casey Bodley [Tue, 14 Jan 2020 14:42:52 +0000 (09:42 -0500)]
qa/rgw: remove test against hadoop v2.8.5
the hadoop branch rel/release-2.8.5 fails to build with:
[INFO] ------------------------------------------------------------------------
[INFO] BUILD FAILURE
[INFO] ------------------------------------------------------------------------
[INFO] Total time: 01:37 min
[INFO] Finished at: 2020-01-14T13:09:02Z
[INFO] ------------------------------------------------------------------------
[ERROR] Failed to execute goal org.apache.maven.plugins:maven-antrun-plugin:1.7:run (create-parallel-tests-dirs) on project hadoop-aws: An Ant BuildException has occured: Unable to create javax script engine for javascript
Casey Bodley [Tue, 26 May 2020 19:03:03 +0000 (15:03 -0400)]
rgw: sanitize newlines in s3 CORSConfiguration's ExposeHeader
the values in the <ExposeHeader> element are sent back to clients in an
Access-Control-Expose-Headers response header. if the values are allowed
to have newlines in them, they can be used to inject arbitrary response
headers.
this issue only affects s3, which gets these values from an xml document.
in swift, they're given in the request header
X-Container-Meta-Access-Control-Expose-Headers, so the value itself
cannot contain newlines.
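A generic sketch of this kind of sanitization, in Python. Replacing CR and LF with spaces is an assumption about the approach; the commit message only says that newlines are sanitized, and the function name is hypothetical.

```python
def sanitize_expose_header(value: str) -> str:
    """Neutralize CR/LF so the value cannot start a new header line."""
    # Assumption: each newline character is replaced by a single space.
    return value.replace("\r", " ").replace("\n", " ")
```

A value such as `"X-A\r\nSet-Cookie: x"` would otherwise end the Access-Control-Expose-Headers line early and add a Set-Cookie header to the response.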
Signed-off-by: Casey Bodley <cbodley@redhat.com>
Reported-by: Adam Mohammed <amohammed@linode.com>
Nathan Cutler [Wed, 24 Jun 2020 19:08:40 +0000 (21:08 +0200)]
doc: PendingReleaseNotes: clean slate for 14.2.11
All of these Pending Release Notes have been included in the official
14.2.10 Release Notes, so keeping them in this file any longer would be
counterproductive.
Without a msg throttler, we can't change osd_client_message_cap.
The throttler is designed to work with 0 as a max, so change the
default to 0 to disable it by default instead.
This doesn't affect the default behavior, it only lets us use this
option again.
Fixes: https://tracker.ceph.com/issues/46143
Conflicts:
src/ceph_osd.cc - new style of gconf() access
"ceph fs subvolume snapshot info <vol_name> <sub_name> <snap_name> [<group_name>]"
The output is in JSON format with the following fields:
created_at: time of creation of snapshot in the format "YYYY-MM-DD HH:MM:SS:ffffff"
data_pool: data pool the snapshot belongs to
has_pending_clones: "yes" if snapshot clone is in progress otherwise "no"
protected: "yes" if snapshot is protected otherwise "no"
size: snapshot size in bytes
Alfonso Martínez [Thu, 23 Jan 2020 10:16:27 +0000 (11:16 +0100)]
ceph.spec.in: fix 'make check' deps for centos8
When running 'FOR_MAKE_CHECK=1 ./install-deps.sh' on CentOS 8,
these dependencies were not being installed.
Missing dependencies are provided by
https://copr.fedorainfracloud.org/coprs/ktdreyer/ceph-el8/
Kefu Chai [Tue, 24 Dec 2019 05:17:55 +0000 (13:17 +0800)]
ceph.spec.in: re-enable "make check" deps for el8
this change partially reverts e92cb7a0. these packages are now
available in AppStream, BaseOS or PowerTools in el8, so
they are re-enabled.
Paul Cuzner [Mon, 11 May 2020 21:22:07 +0000 (09:22 +1200)]
mgr/k8sevents: sanitise kubernetes events
Kubernetes can generate events without a timestamp
or an event count. When this occurs the k8sevents 'ls'
command fails due to None type values. This patch
sanitises the event received before adding to the
internal data structures to account for these
issues.
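The sanitising step might look like the sketch below. The field names, the default count of 1, and the fallback timestamp are all assumptions for illustration; the commit message only states that events may arrive without a timestamp or an event count.

```python
from datetime import datetime, timezone
from typing import Optional


def sanitise_event(event: dict, now: Optional[datetime] = None) -> dict:
    """Return a copy of event with None count/timestamp replaced.

    Field names and defaults are hypothetical.
    """
    now = now or datetime.now(timezone.utc)
    clean = dict(event)                   # leave the caller's dict untouched
    if clean.get("count") is None:
        clean["count"] = 1                # assume the event happened once
    if clean.get("last_timestamp") is None:
        clean["last_timestamp"] = now     # assume "seen now"
    return clean
```

Normalising the values once, before they reach the internal data structures, means that later formatting and sorting code does not have to handle None.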