* refs/pull/35759/head:
qa: fix flake8 warnings
doc: add documentation for new ephemeral pinning feature
pybind/mgr/volumes: wire up pinning subvolumes/subvolumegroups
qa: adapt tests for empty pinned dir export
qa: break export pin tests into discrete tests
qa: add more ephemeral pin tests
qa: add tests for ephemeral pinning
mds: add maximum random ephemeral pin percentage
mds: replicate random pin state
mds: finish implementation of ephemeral pins
mds: do string equality comparison
mds: add ephemeral pinning for subtrees
mds: trim pinned and empty subtrees
mds: refactor remove_subtree
mds: allow export of pinned directory if empty
mds: reduce subtree processing verbosity
mds: skip export of empty directories
mds: remove frozen export pin from queue
mds: simplify for loop construction
mds: add debug messages for export queue processing
qa: refactor _wait_subtree and _get_subtree
qa: use status from wait_for_daemons
qa: quietly print json output from asok commands
mgr/dashboard: Prometheus query error in the metrics of Pools, OSDs and RBD images
Fixes: https://tracker.ceph.com/issues/45068
Signed-off-by: Avan Thakkar <athakkar@redhat.com>
(cherry picked from commit 47b515c09496da8fc326300bab6618250466effe)
This is slightly evil in its current form. The MDS should use locks to
transmit state changes, but right now the state is just set when the
CInode is replicated. Replicating this state marker is necessary for
failover situations where we want the randomly pinned subtree to remain
pinned across failovers.
Note: this problem does not exist for the ephemeral distributed pins
because simple knowledge of the immediate parent's setting (which is
replicated normally) is sufficient to determine if the CInode is
ephemerally distributed. Ditto for regular export pins.
The string::find check for ceph.dir.pin would also succeed for the
other ephemeral pin xattr names. For this reason, it was never possible
to actually turn ephemeral pins on!
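
The difference between a substring search and an equality comparison can be
illustrated with a short Python sketch. The helper names are hypothetical, and
the two longer xattr names are assumed here to be `ceph.dir.pin.random` and
`ceph.dir.pin.distributed`; the actual fix lives in the MDS C++ code, which is
not shown in this log.

```python
EXPORT_PIN = "ceph.dir.pin"
# Assumed names of the ephemeral pin xattrs (not spelled out in the log above).
EPHEMERAL_XATTRS = ("ceph.dir.pin.random", "ceph.dir.pin.distributed")


def is_export_pin_by_find(name: str) -> bool:
    """Buggy check: true whenever the name merely contains EXPORT_PIN."""
    return name.find(EXPORT_PIN) != -1


def is_export_pin_by_equality(name: str) -> bool:
    """Fixed check: true only for the exact xattr name."""
    return name == EXPORT_PIN


def classify(name: str, is_export_pin) -> str:
    """Dispatch an xattr name, testing the plain export pin first."""
    if is_export_pin(name):
        return "export_pin"
    if name in EPHEMERAL_XATTRS:
        return "ephemeral"
    return "other"
```

With the substring check, every ephemeral xattr is classified as a plain export
pin, so the ephemeral branch is never reached; with the equality check it is.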
This PR introduces the inode xattrs export_ephemeral_random and
export_ephemeral_distributed, which enable two different metadata
distribution strategies: the first is suited to depthwise scaling of
metadata (the height of the tree keeps increasing) and the second to
horizontal scaling (many subtrees under a single parent).
export_ephemeral_distributed is not hierarchical. Any direct
descendant directory (i.e. a child directory) has an ephemeral export
pin applied to it according to a consistent hash of the child directory
inode number. export_ephemeral_random is hierarchical like
"export_pin". Any CDir loaded into the cache may be ephemerally pinned
to a random rank. Like "export_ephemeral_distributed", the random rank
is determined by a consistent hash.
The metadata distribution strategies are facilitated by using John
Lamping and Eric Veach's Jump Consistent Hash as the consistent hash
algorithm. This hashing algorithm eliminates the need to store the data
structures representing the consistent hash cluster state and performs
as well as Akamai's original implementation, providing a fairly uniform
distribution. The algorithm only works for distributed systems in which
the buckets (nodes) are numbered in ascending order and cluster resizes
do not produce any holes in the arrangement of nodes, i.e. (0, 1, 2, 3)
--[removing node 1]--> (0, 1, 2). CephFS satisfies these conditions, as
the MDSs are arranged as numbered ranks and cluster modifications do
not produce any holes in the resulting arrangement of ranks.
Fixes: https://tracker.ceph.com/issues/41302
Signed-off-by: Sidharth Anupkrishnan <sanupkri@redhat.com>
Signed-off-by: Patrick Donnelly <pdonnell@redhat.com>
(cherry picked from commit ced15ed7ef70ff832d4bebedecb89944276b0395)
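
For reference, the Jump Consistent Hash named in the commit message above can
be sketched in a few lines of Python. This follows the published algorithm by
Lamping and Veach; `pick_rank` and its parameters are hypothetical names, and
how the MDS actually derives the hash key from an inode is not shown in this
log.

```python
MASK64 = 0xFFFFFFFFFFFFFFFF


def jump_consistent_hash(key: int, num_buckets: int) -> int:
    """Map a 64-bit key to a bucket in [0, num_buckets) without stored state."""
    if num_buckets <= 0:
        raise ValueError("num_buckets must be positive")
    key &= MASK64
    b, j = -1, 0
    while j < num_buckets:
        b = j
        # 64-bit linear congruential step, emulating unsigned overflow.
        key = (key * 2862933555777941757 + 1) & MASK64
        # Jump ahead to the next bucket count at which this key would move.
        j = int((b + 1) * (float(1 << 31) / float((key >> 33) + 1)))
    return b


def pick_rank(ino: int, num_ranks: int) -> int:
    """Hypothetical helper: choose an MDS rank for a directory inode number."""
    return jump_consistent_hash(ino, num_ranks)
```

When the number of buckets grows from n to n + 1, a key either stays in its
bucket or moves to the new bucket n, which is why the algorithm needs
contiguous bucket numbering and no per-node state.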
Patrick Donnelly [Tue, 30 Jun 2020 21:22:15 +0000 (14:22 -0700)]
Merge PR #35809 into octopus
* refs/pull/35809/head:
qa/tasks/vstart_runner.py: be python3 compatible
doc/dev/developer_guide: use python3 to launch vstart_runner.py
pybind/mgr/dashboard/run-backend-api-tests.sh: use python3 by default
Reviewed-by: Patrick Donnelly <pdonnell@redhat.com>
This commit adds dmcrypt support to `ceph-volume raw` mode.
Note about the `ceph-volume raw list` change:
Given that the `lsblk -J` (JSON output) option isn't available on all OSes, I
added the '--inverse' option to the existing command, which makes the mapper
devices appear in that command's output. Not listing root devices that contain
partitions shouldn't have side effects since we are in the `ceph-volume raw`
context.
Example:
running `lsblk --paths --nodeps --output=NAME --noheadings` does not return
the mapper list, because with `--nodeps` only the top-level devices are printed.
Adding `--inverse` is a trick to get around this issue. The downside is that
we can't list root devices if they contain at least one partition, but this
shouldn't be an issue in the `ceph-volume raw` context given we only deal with
raw devices.
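
As a rough illustration of the command described above, the following Python
sketch builds the `lsblk` invocation with and without `--inverse` and parses
its one-name-per-line output. The function names are hypothetical and this is
not the ceph-volume implementation; only the `lsblk` flags come from the commit
message.

```python
import subprocess

# Flags quoted in the commit message above.
LSBLK_BASE = ["lsblk", "--paths", "--nodeps", "--output=NAME", "--noheadings"]


def lsblk_command(inverse: bool = True) -> list:
    """Return the lsblk argv, optionally with the --inverse workaround."""
    cmd = list(LSBLK_BASE)
    if inverse:
        cmd.append("--inverse")
    return cmd


def parse_device_names(output: str) -> list:
    """Split --noheadings --output=NAME output into a list of device paths."""
    return [line.strip() for line in output.splitlines() if line.strip()]


def list_devices(inverse: bool = True) -> list:
    """Run lsblk and return the device paths it prints (requires lsblk)."""
    result = subprocess.run(
        lsblk_command(inverse), check=True, capture_output=True, text=True
    )
    return parse_device_names(result.stdout)
```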
Casey Bodley [Tue, 26 May 2020 19:03:03 +0000 (15:03 -0400)]
rgw: sanitize newlines in s3 CORSConfiguration's ExposeHeader
the values in the <ExposeHeader> element are sent back to clients in an
Access-Control-Expose-Headers response header. if the values are allowed
to have newlines in them, they can be used to inject arbitrary response
headers.
this issue only affects s3, which gets these values from an xml document.
in swift, they're given in the request header
X-Container-Meta-Access-Control-Expose-Headers, so the value itself
cannot contain newlines.
Signed-off-by: Casey Bodley <cbodley@redhat.com>
Reported-by: Adam Mohammed <amohammed@linode.com>
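
The header injection described in the rgw commit above can be sketched in
Python. This is only an illustration with hypothetical function names; whether
the actual RGW fix strips, replaces, or rejects such values is not stated in
this log, and stripping is assumed here.

```python
def sanitize_header_value(value: str) -> str:
    """Drop CR and LF so the value cannot terminate the header line early."""
    return value.replace("\r", "").replace("\n", "")


def build_expose_headers(values) -> str:
    """Render an Access-Control-Expose-Headers line from ExposeHeader values."""
    cleaned = [sanitize_header_value(v) for v in values]
    return "Access-Control-Expose-Headers: " + ", ".join(cleaned)


# Without sanitizing, the second value would start a new response header.
unsafe_values = ["X-Custom", "X-Other\r\nSet-Cookie: injected=1"]
header_line = build_expose_headers(unsafe_values)
```

After sanitizing, the attacker-supplied text stays inside the single
Access-Control-Expose-Headers line instead of becoming a header of its own.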
Nathan Cutler [Wed, 24 Jun 2020 19:08:40 +0000 (21:08 +0200)]
doc: PendingReleaseNotes: clean slate for 15.2.5
All of these Pending Release Notes have been included in the official
15.2.4 Release Notes, so keeping them in this file any longer would be
counterproductive.
Ernesto Puerta [Mon, 11 May 2020 18:33:25 +0000 (20:33 +0200)]
mgr/dashboard: work with RBD images v1
Add support for RBD Image Format v1:
- This format lacks the ID field required by the dashboard. Instead, the
  RBD image `block_name_prefix` is used as a unique ID (together with pool
  id and namespace).
- Additionally, `image_format` is now exposed.
- On the front-end side:
  - Copy action on a v1 image will cause the image to be copied to v2
    format.
  - List doesn't allow Move to Trash on v1 images.
  - Details section now shows `image_format` for images.
  - Edit Form disables flags not supported for v1 (`deep-flatten`,
    `layering`, `exclusive-lock`).
  - Protect does not work on v1 images or v2 images created from v1
    ones.
Conflicts:
src/pybind/mgr/dashboard/frontend/src/app/ceph/block/rbd-details/rbd-details.component.html: keep only new row
src/pybind/mgr/dashboard/services/ceph_service.py: keep only 1 method, add import Union
Tiago Melo [Sat, 9 May 2020 02:28:39 +0000 (02:28 +0000)]
mgr/dashboard: Fix random E2E error in mgr-modules
This test failed at random times when it tried to find the new value of pool_ids
in the balancer module.
This happened because the value of pool_ids is automatically reverted by ceph,
so in some situations when we tried to read the new value,
it was already reverted and failed.
Enhanced the tests to be able to use any text input, not only the ones with
empty default values.
Now subscribe only emits if the value is not undefined,
removing the need to do the validation after each subscribe.
Add a new subscribeOnce method, which replaces getCurrentSummary.
Like subscribe, this will only emit when the value is defined,
and will only emit once.
After it emits, the subscription is closed automatically.
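
The subscribe and subscribeOnce behaviour described above can be modelled with
a small Python analogue. The dashboard code itself is TypeScript and is not
shown here; the class and method names below are hypothetical, `None` stands in
for `undefined`, and replaying the current value to a new subscriber is an
assumption.

```python
class SummaryService:
    """Minimal observable: subscribers never see None (i.e. 'undefined')."""

    def __init__(self):
        self._value = None
        self._subscribers = []

    def subscribe(self, callback):
        """Register callback; replay the current value if one is defined."""
        self._subscribers.append(callback)
        if self._value is not None:
            callback(self._value)

        def unsubscribe():
            if callback in self._subscribers:
                self._subscribers.remove(callback)

        return unsubscribe

    def subscribe_once(self, callback):
        """Emit the first defined value only, then close the subscription."""
        state = {"done": False, "unsubscribe": None}

        def wrapper(value):
            if state["done"]:
                return
            state["done"] = True
            callback(value)
            if state["unsubscribe"] is not None:
                state["unsubscribe"]()

        state["unsubscribe"] = self.subscribe(wrapper)
        if state["done"]:
            # The value was replayed during subscribe(); close immediately.
            state["unsubscribe"]()
        return state["unsubscribe"]

    def publish(self, value):
        """Store the value and notify subscribers unless it is None."""
        self._value = value
        if value is None:
            return
        for callback in list(self._subscribers):
            callback(value)
```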
Alfonso Martínez [Fri, 22 May 2020 11:36:10 +0000 (13:36 +0200)]
mgr/dashboard: grafana panels for rgw multisite sync performance
* RGW sync perf. counters are now exposed through grafana panels.
* Sync Performance tab is only shown if rgw realm is detected.
* Prometheus module: added metrics suitable for prometheus consumption (from existing ones, not replacing for backward compatibility).
Fixes: https://tracker.ceph.com/issues/45310
Signed-off-by: Alfonso Martínez <almartin@redhat.com>
(cherry picked from commit cf4ff7d2f03bc285a3fae3f27577333f11dab58a)
Conflicts:
src/pybind/mgr/dashboard/run-frontend-e2e-tests.sh
There was an extra square bracket in octopus that caused the conflict.
This was manually fixed, since the commit that removed it will not be
backported.
mgr/volumes: Create subvolume with isolated rados namespace
1. Add the --namespace-isolated option to the 'subvolume create' command
to create the subvolume in a separate RADOS namespace
2. Add the "pool_namespace" field to the 'subvolume info' command,
which displays the RADOS namespace if set, otherwise an empty string
"ceph fs subvolume snapshot info <vol_name> <sub_name> <snap_name> [<group_name>]"
The output is in JSON format with the following fields:
created_at: time of creation of snapshot in the format "YYYY-MM-DD HH:MM:SS:ffffff"
data_pool: data pool the snapshot belongs to
has_pending_clones: "yes" if snapshot clone is in progress otherwise "no"
protected: "yes" if snapshot is protected otherwise "no"
size: snapshot size in bytes
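
A consumer of this output might parse it as in the Python sketch below. The
field names and the `created_at` format are taken from the description above;
the sample values and the function name are invented for illustration, and the
"yes"/"no" strings are converted to booleans as a convenience.

```python
import json
from datetime import datetime

# "YYYY-MM-DD HH:MM:SS:ffffff", with a colon before the microseconds.
CREATED_AT_FORMAT = "%Y-%m-%d %H:%M:%S:%f"

# Invented sample output; real values will differ.
SAMPLE = """
{
    "created_at": "2020-06-30 21:22:15:000123",
    "data_pool": "cephfs_data",
    "has_pending_clones": "no",
    "protected": "yes",
    "size": 4096
}
"""


def parse_snapshot_info(raw: str) -> dict:
    """Convert the JSON text into native Python types."""
    info = json.loads(raw)
    return {
        "created_at": datetime.strptime(info["created_at"], CREATED_AT_FORMAT),
        "data_pool": info["data_pool"],
        "has_pending_clones": info["has_pending_clones"] == "yes",
        "protected": info["protected"] == "yes",
        "size": int(info["size"]),
    }


snapshot = parse_snapshot_info(SAMPLE)
```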
Nizamudeen A [Wed, 22 Apr 2020 11:23:41 +0000 (16:53 +0530)]
mgr/dashboard: Asynchronous unique username validation for User Component
Implements asynchronous validation for the username field in the Create User form, which immediately displays an error message if the username already exists.