common/bl, *: deprecate list::claim() in favor of operator=(list&&).
The motivation is that `claim(list&)` seems to actually be a pre-C++11
counterpart of the already available `operator=(list&&)`.
This commit deprecates the `claim()` method but doesn't drop it yet.
All occurrences of `buffer::list::claim(list&)` are either
* switched to `list::operator=(list&&)`, or
* reworked to use `list::list(list&&)` instead.
Kefu Chai [Fri, 26 Jun 2020 14:26:40 +0000 (22:26 +0800)]
json_spirit: avoid using bind placeholders in global namespace
to silence the warning from boost v1.73, like
json_spirit/json_spirit_reader.cpp:7:
/opt/ceph/include/boost/bind.hpp:36:1: note: ‘#pragma message: The practice of declaring the Bind placeholders (_1, _2, ...) in the global namespace is deprecated. Please use <boost/bind/bind.hpp> +
using namespace boost::placeholders, or define BOOST_BIND_GLOBAL_PLACEHOLDERS to retain the current behavior.’
36 | BOOST_PRAGMA_MESSAGE(
| ^~~~~~~~~~~~~~~~~~~~
Kefu Chai [Fri, 26 Jun 2020 09:48:42 +0000 (17:48 +0800)]
script/gen_static_command_descriptions: add mock for rados python binding
to address an issue like
Traceback (most recent call last):
File "/home/jenkins-build/build/workspace/ceph-pr-docs/doc/scripts/../../src/script/gen_static_command_descriptions.py", line 168, in <module>
print(json.dumps(gen_commands_dicts(), indent=2, sort_keys=True))
File "/home/jenkins-build/build/workspace/ceph-pr-docs/doc/scripts/../../src/script/gen_static_command_descriptions.py", line 163, in gen_commands_dicts
comms = from_mon_commands_h() + from_mgr_modules()
File "/home/jenkins-build/build/workspace/ceph-pr-docs/doc/scripts/../../src/script/gen_static_command_descriptions.py", line 85, in from_mgr_modules
comms = sum([list_mgr_module(name) for name in names], [])
File "/home/jenkins-build/build/workspace/ceph-pr-docs/doc/scripts/../../src/script/gen_static_command_descriptions.py", line 85, in <listcomp>
comms = sum([list_mgr_module(name) for name in names], [])
File "/home/jenkins-build/build/workspace/ceph-pr-docs/doc/scripts/../../src/script/gen_static_command_descriptions.py", line 65, in list_mgr_module
mgr_mod = __import__(m_name, globals(), locals(), [], 0)
File "/home/jenkins-build/build/workspace/ceph-pr-docs/src/pybind/mgr/volumes/__init__.py", line 2, in <module>
from .module import Module
File "/home/jenkins-build/build/workspace/ceph-pr-docs/src/pybind/mgr/volumes/module.py", line 8, in <module>
from .fs.nfs import NFSCluster, FSExport
File "/home/jenkins-build/build/workspace/ceph-pr-docs/src/pybind/mgr/volumes/fs/nfs.py", line 6, in <module>
from rados import TimedOut
ImportError: cannot import name 'TimedOut'
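A minimal sketch of the mocking idea, assuming the docs build only needs `from rados import TimedOut` to succeed. The helper name `install_rados_stub` and the stub class are hypothetical and are not taken from the script itself:

```python
import sys
import types


def install_rados_stub():
    """Register a stand-in 'rados' module so imports of it succeed."""
    stub = types.ModuleType("rados")

    class TimedOut(Exception):
        """Placeholder for the exception exported by the real binding."""

    stub.TimedOut = TimedOut
    # Leave any already-imported 'rados' module in place.
    sys.modules.setdefault("rados", stub)
    return sys.modules["rados"]


install_rados_stub()
from rados import TimedOut  # now resolves instead of raising ImportError
```

With the stub registered before the mgr modules are imported, the import chain shown in the traceback no longer depends on the compiled binding being present.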
Matthew Oliver [Fri, 26 Jun 2020 00:15:12 +0000 (00:15 +0000)]
cephadm: ceph-iscsi remove pool from cap
When we create a ceph-iscsi daemon/container in cephadm we create a user
and set some caps. Turns out we were a little too restrictive.
We were locking down to only access the pool that was given in the spec,
which happens to be the pool where the iscsi config is stored. But in reality
we need to be able to attach any rbd images, which could exist in other
pools.
So this patch removes the `pool=` from the osd cap, so from:
osd = allow rwx pool={spec.pool}
To:
osd = allow rwx
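As a rough illustration of the before and after, here is a sketch of how such a cap string could be assembled. The helper `iscsi_osd_cap` and the pool name `iscsi-config` are hypothetical and do not come from cephadm:

```python
from typing import Optional


def iscsi_osd_cap(pool: Optional[str] = None) -> str:
    """Return the osd cap string; pass a pool to get the old, restricted form."""
    cap = "allow rwx"
    if pool is not None:
        cap += f" pool={pool}"
    return cap


old_cap = iscsi_osd_cap("iscsi-config")  # restricted to a single pool
new_cap = iscsi_osd_cap()                # no pool restriction, as after this patch
```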
Fixes: https://tracker.ceph.com/issues/46138
Signed-off-by: Matthew Oliver <moliver@suse.com>
This commit adds the dmcrypt support in `ceph-volume raw` mode.
Note about `ceph-volume raw list` change:
Given that the `lsblk -J` (JSON output) option isn't available on all OSes, I came up with
adding the `--inverse` option to the existing command, which allows us to get the
mapper devices list in that command's output. Not listing root devices containing
partitions shouldn't have side effects since we are in the `ceph-volume raw`
context.
example:
running `lsblk --paths --nodeps --output=NAME --noheadings` doesn't allow us to
get the mapper list because the output is like the following:
adding `--inverse` is a trick to get around this issue; the trade-off is that
we can't list root devices if they contain at least one partition, but this
shouldn't be an issue in the `ceph-volume raw` context given that we only deal with
raw devices.
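A small sketch of the command described above, assuming the caller only needs the device paths printed one per line. The helper names are hypothetical and this is not the ceph-volume code:

```python
import subprocess

# Flags quoted in the commit message above.
LSBLK_BASE = ["lsblk", "--paths", "--nodeps", "--output=NAME", "--noheadings"]


def lsblk_argv(inverse: bool = True) -> list:
    """Build the lsblk command line, optionally appending --inverse."""
    return LSBLK_BASE + (["--inverse"] if inverse else [])


def parse_names(stdout: str) -> list:
    """Turn one-name-per-line output into a list of device paths."""
    return [line.strip() for line in stdout.splitlines() if line.strip()]


def list_devices(inverse: bool = True) -> list:
    """Run lsblk and return the paths it prints (requires lsblk on PATH)."""
    result = subprocess.run(
        lsblk_argv(inverse), capture_output=True, text=True, check=True
    )
    return parse_names(result.stdout)
```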
Casey Bodley [Tue, 26 May 2020 19:03:03 +0000 (15:03 -0400)]
rgw: sanitize newlines in s3 CORSConfiguration's ExposeHeader
the values in the <ExposeHeader> element are sent back to clients in an
Access-Control-Expose-Headers response header. if the values are allowed
to have newlines in them, they can be used to inject arbitrary response
headers
this issue only affects s3, which gets these values from an xml document
in swift, they're given in the request header
X-Container-Meta-Access-Control-Expose-Headers, so the value itself
cannot contain newlines
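To make the injection concrete, here is a sketch of one possible sanitizing approach: dropping any value that contains CR or LF before the header is rendered. This is an assumption for illustration and not necessarily what the rgw change does; the function names are hypothetical:

```python
def format_expose_headers(values):
    """Join ExposeHeader values, skipping any that contain CR or LF."""
    safe = [v for v in values if "\r" not in v and "\n" not in v]
    return ", ".join(safe)


def response_header(values):
    """Render the response header line from the sanitized values."""
    return "Access-Control-Expose-Headers: " + format_expose_headers(values)


# The second value would otherwise smuggle a Set-Cookie header into the response.
malicious = ["X-Ok", "X-Evil\r\nSet-Cookie: injected=1"]
line = response_header(malicious)
```

Without the filter, the embedded `\r\n` would end the header line early and the remainder would be parsed by the client as a separate response header.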
Signed-off-by: Casey Bodley <cbodley@redhat.com>
Reported-by: Adam Mohammed <amohammed@linode.com>
Kefu Chai [Thu, 25 Jun 2020 11:54:55 +0000 (19:54 +0800)]
install-deps.sh: always use python3-sphinx to build the docs
python-sphinx creates a symlink at /usr/bin/sphinx-build pointing to
../share/sphinx/scripts/python2/sphinx-build even if python3-sphinx is already installed.
in that case, if python-sphinx is installed after python3-sphinx,
sphinx-build runs under python2, and that breaks the build of master, as we have been
using python3 syntax in conf.py since e9e17b9ceff0c862c8f29d629ad54a1dc401ed73.
and we are still using python2 as well as "python-sphinx" when building nautilus,
so if a build slave happens to compile nautilus before it is scheduled to
build master or a wip- branch, the doc build fails.
in this change, python-sphinx is uninstalled if /usr/bin/sphinx-build
is found to be written in python2.
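The check itself lives in install-deps.sh; the following is only a Python sketch of the same idea, assuming the interpreter can be guessed from the script's shebang line. Treating a bare `python` shebang as Python 2 is an assumption, and both helper names are hypothetical:

```python
def shebang_is_python2(first_line):
    """Guess from a shebang line whether the script runs under Python 2."""
    line = first_line.strip()
    if not line.startswith("#!") or "python" not in line:
        return False
    # A shebang naming python3 is fine; "python" or "python2" is treated as Python 2.
    return "python3" not in line


def script_uses_python2(path="/usr/bin/sphinx-build"):
    """Read the first line of the script at path and classify its shebang."""
    try:
        with open(path, "rb") as f:
            first = f.readline().decode("utf-8", errors="replace")
    except OSError:
        # A missing or unreadable script is not reported as Python 2.
        return False
    return shebang_is_python2(first)
```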
Kefu Chai [Thu, 25 Jun 2020 02:41:30 +0000 (10:41 +0800)]
mgr: avoid false alarm of MGR_MODULE_ERROR
mgr sends a health report periodically; the report includes
information on whether the always-on modules are loaded or not. but the
modules are loaded in two steps:
1. load the options and commands exposed by modules. the options and
commands are registered using static methods of the subclass of
MgrModule.
2. create an instance of the subclass of MgrModule. this is performed
in the background by a Finisher thread. upon finishing the construction
of the instance, ActivePyModules::start_one() adds the module which
successfully creates the class to `modules`.
but there is a chance that when mgr sends the health report, the always-on
module is still creating its instance of the MgrModule subclass, or that
task is still pending in the finisher thread. in that case, mgr would
add a false error message like
```
4 mgr modules have failed (MGR_MODULE_ERROR)
```
to the health report.
in this change, the number of modules in pending state is tracked,
and mgr will not take the missing always-on modules into account unless
the number of pending modules is 0.
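The bookkeeping described above can be sketched as follows. This is a simplified model, not the mgr implementation; the class `ModuleTracker` and its method names are hypothetical:

```python
class ModuleTracker:
    """Track always-on modules that are loaded versus still being constructed."""

    def __init__(self, always_on):
        self.always_on = set(always_on)
        self.loaded = set()
        self.pending = 0

    def schedule(self, name):
        """Construction of a module was queued on the background thread."""
        self.pending += 1

    def finished(self, name, ok=True):
        """Construction completed; record the module only if it succeeded."""
        self.pending -= 1
        if ok:
            self.loaded.add(name)

    def failed_modules(self):
        """Report missing always-on modules only once nothing is pending."""
        if self.pending > 0:
            return set()
        return self.always_on - self.loaded
```

While any construction is still pending, `failed_modules()` returns an empty set, so a report generated in that window cannot raise the false MGR_MODULE_ERROR.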
Patrick Donnelly [Thu, 25 Jun 2020 01:24:17 +0000 (18:24 -0700)]
Merge PR #30592 into master
* refs/pull/30592/head:
qa: fix flake8 warnings
doc: add documentation for new ephemeral pinning feature
mds: enable ephemeral pinning by default
pybind/mgr/volumes: wire up pinning subvolumes/subvolumegroups
qa: adapt tests for empty pinned dir export
qa: break export pin tests into discrete tests
qa: add more ephemeral pin tests
qa: add tests for ephemeral pinning
mds: add maximum random ephemeral pin percentage
mds: replicate random pin state
mds: finish implementation of ephemeral pins
mds: do string equality comparison
mds: add ephemeral pinning for subtrees
mds: trim pinned and empty subtrees
mds: refactor remove_subtree
mds: allow export of pinned directory if empty
mds: reduce subtree processing verbosity
mds: skip export of empty directories
mds: remove frozen export pin from queue
mds: simplify for loop construction
mds: add debug messages for export queue processing
qa: refactor _wait_subtree and _get_subtree
qa: use status from wait_for_daemons
qa: quietly print json output from asok commands
Reviewed-by: Mark Nelson <mnelson@redhat.com>
Reviewed-by: Ramana Raja <rraja@redhat.com>
Reviewed-by: Patrick Donnelly <pdonnell@redhat.com>
This is slightly evil in its current form. The MDS should use locks to
transmit state changes but right now it's just set when the CInode is
replicated. The replication of this state marker is necessary for
failover situations where we want the randomly pinned subtree to remain
pinned across failovers.
Note: this problem does not exist for the ephemeral distributed pins
because simple knowledge of the immediate parent's setting (which is
replicated normally) is sufficient to determine if the CInode is
ephemerally distributed. Ditto for regular export pins.
Signed-off-by: Patrick Donnelly <pdonnell@redhat.com>
The string::find method would return true for ceph.dir.pin even for the
other ephemeral pin xattr names. For this reason, it was never possible
to actually turn ephemeral pins on!
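The MDS code is C++; the following is only a Python analogue of the pitfall. It assumes the old check was a prefix-style `find` and that the ephemeral xattr names share the `ceph.dir.pin` prefix (for example `ceph.dir.pin.random`). The function names are hypothetical:

```python
def classify_buggy(name):
    """Prefix-style match: the generic name shadows the more specific ones."""
    if name.find("ceph.dir.pin") == 0:
        return "export_pin"
    if name.find("ceph.dir.pin.random") == 0:
        return "export_ephemeral_random"  # never reached for the name above
    return "unknown"


def classify_fixed(name):
    """Exact string equality distinguishes the xattr names."""
    if name == "ceph.dir.pin":
        return "export_pin"
    if name == "ceph.dir.pin.random":
        return "export_ephemeral_random"
    return "unknown"
```

With the prefix match, a request to set the random pin is handled by the plain export pin branch, which is why the ephemeral branch could never be taken.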
Signed-off-by: Patrick Donnelly <pdonnell@redhat.com>
This PR introduces inode xattrs export_ephemeral_random and
export_ephemeral_distributed, which enable two different metadata
distribution strategies: the first is suitable for depthwise
scaling of metadata (the height of the tree keeps increasing) and the latter
for horizontal scaling (many subtrees under a single parent).
export_ephemeral_distributed is not hierarchical. Any direct
descendant directory (i.e. a child directory) has an ephemeral export
pin applied to it according to a consistent hash of the child directory
inode number. export_ephemeral_random is hierarchical like
"export_pin". Any CDir loaded into the cache may be ephemerally pinned
to a random rank. Like "export_ephemeral_distributed", the random rank
is determined by a consistent hash.
The metadata distribution strategies are facilitated by using John
Lamping and Eric Veach's Jump Consistent Hashing as the consistent hash
algorithm. This hashing algorithm eliminates the need to store the data
structures representing the consistent hash cluster state and performs
as well as Akamai's original implementation, providing a fairly uniform
distribution. This algorithm only works for distributed systems with
numbered buckets (nodes) arranged in ascending order, where cluster resizes
do not produce any holes in the arrangement of nodes, i.e. (0, 1, 2, 3)
--[removing node 1]--> (0, 1, 2). CephFS satisfies these conditions as
the MDSs are arranged as numbered ranks and cluster modifications do
not produce any holes in the resulting arrangement of ranks.
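For reference, a Python transcription of the published jump consistent hash routine. How the MDS derives the key (for instance from an inode number) and maps buckets to ranks is not shown here and is left as an assumption:

```python
MASK64 = 0xFFFFFFFFFFFFFFFF


def jump_consistent_hash(key: int, num_buckets: int) -> int:
    """Map a 64-bit key to a bucket in [0, num_buckets)."""
    if num_buckets <= 0:
        raise ValueError("num_buckets must be positive")
    key &= MASK64
    b, j = -1, 0
    while j < num_buckets:
        b = j
        # 64-bit linear congruential step, wrapped explicitly in Python.
        key = (key * 2862933555777941757 + 1) & MASK64
        j = int((b + 1) * (float(1 << 31) / float((key >> 33) + 1)))
    return b


# Hypothetical usage: pick a rank for an inode number with 3 active ranks.
rank = jump_consistent_hash(0x10000000001, 3)
```

No per-cluster state is stored: the bucket is computed from the key and the bucket count alone, and growing from n to n+1 buckets only ever moves keys into the new bucket n.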
Fixes: https://tracker.ceph.com/issues/41302
Signed-off-by: Sidharth Anupkrishnan <sanupkri@redhat.com>
Signed-off-by: Patrick Donnelly <pdonnell@redhat.com>