Jeff Layton [Wed, 3 Jun 2020 15:29:07 +0000 (11:29 -0400)]
fuse: update to newer FUSE_USE_VERSION
The build was failing for me against fuse-devel v3.9.1. The prototype
for fuse_ll_ioctl was wrong, as it was expecting the old-style one with
signed int args.
In newer libfuse versions, the prototype varies based on
FUSE_USE_VERSION. Update to a newer FUSE_USE_VERSION value to ensure
that we use the newer ioctl prototype. This also means that we need to
handle a new prototype for fuse_session_loop_mt as well.
While we're in here, move the definition of FUSE_USE_VERSION to
ceph_fuse.h so we have the definition in one place. This does mean we
need to reorganize the includes in a few places.
Fixes: https://tracker.ceph.com/issues/45866 Signed-off-by: Jeff Layton <jlayton@redhat.com>
Jason Dillaman [Thu, 28 May 2020 21:59:39 +0000 (17:59 -0400)]
librbd: restore missing flush on write-block logic
When creating the new image dispatch layer, the original flush
upon write-block was dropped. This results in intermittent race
conditions where object IO is still in flight when the write-block
indicates it has completed.
Signed-off-by: Jason Dillaman <dillaman@redhat.com>
Jason Dillaman [Wed, 27 May 2020 23:14:14 +0000 (19:14 -0400)]
librbd: exclusive lock image dispatch should not wait on IO when setting lock
IO from later dispatch layers might have caused the need to acquire the lock
(e.g. an image refresh). In that case, the IO is blocked waiting for the
exclusive lock to be acquired, while the lock acquisition in turn waits for
that IO to flush, resulting in a deadlock.
Signed-off-by: Jason Dillaman <dillaman@redhat.com>
Kefu Chai [Wed, 3 Jun 2020 01:39:26 +0000 (09:39 +0800)]
qa/tasks/vstart_runner: do not teardown test_path if "create-cluster-only"
otherwise we could try to remove a "None" directory when tearing down the cluster,
and hit the following failure:
Exception ignored in: <bound method LocalContext.__del__ of <__main__.LocalContext object at 0x7f99fd4a6cc0>>
Traceback (most recent call last):
  File "../qa/tasks/vstart_runner.py", line 1189, in __del__
    shutil.rmtree(self.teuthology_config['test_path'])
  File "/tmp/tmp.mmM2ugspuR/venv/lib/python3.6/shutil.py", line 477, in rmtree
    onerror(os.lstat, path, sys.exc_info())
  File "/tmp/tmp.mmM2ugspuR/venv/lib/python3.6/shutil.py", line 475, in rmtree
    orig_st = os.lstat(path)
TypeError: lstat: path should be string, bytes or os.PathLike, not NoneType
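The guard described above can be sketched as follows. This is a minimal illustration, not the actual vstart_runner.py change: the helper name `remove_test_path` is hypothetical, and the real code reads the path from `self.teuthology_config['test_path']`.

```python
import os
import shutil
import tempfile
from typing import Optional


def remove_test_path(test_path: Optional[str]) -> bool:
    """Remove test_path if it was set and exists; return True if it was removed.

    Hypothetical helper: when only the cluster was created, no test directory
    exists and test_path may be None, so rmtree must not be called on it.
    """
    if test_path is None or not os.path.isdir(test_path):
        return False
    shutil.rmtree(test_path)
    return True


if __name__ == "__main__":
    scratch = tempfile.mkdtemp()
    print(remove_test_path(None))     # unset path: nothing to remove
    print(remove_test_path(scratch))  # real directory: removed
```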
* refs/pull/34719/head:
ceph-fuse: compatible with libfuse3.5 or higher
cmake: to get the header and library from specified path
libfuse: check the libfuse version from the pkgconfig/fuse{3}.pc file
Reviewed-by: Zheng Yan <zyan@redhat.com> Reviewed-by: Kefu Chai <kchai@redhat.com>
Samuel Just [Tue, 12 May 2020 04:02:07 +0000 (21:02 -0700)]
crimson: distinguish record and block relative paddrs
Blocks get read independently of the surrounding record,
so paddrs embedded directly in a block need to refer
to other blocks within the same record by a block_relative
addr relative to the block's own offset. By contrast,
deltas to existing blocks need to use record_relative
addrs relative to the first block of the record.
This patch distinguishes the two kinds of relative paddr
(mainly for debugging purposes) and adapts cache, journal,
etc to use the appropriate types.
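As a rough model of the two address kinds, the sketch below resolves each against a different base offset. It is a simplified illustration in Python, not crimson's actual types: `Base`, `RelPaddr`, `resolve`, and the plain integer offsets are all assumptions made for the example.

```python
from dataclasses import dataclass
from enum import Enum


class Base(Enum):
    RECORD = "record_relative"  # relative to the first block of the record
    BLOCK = "block_relative"    # relative to the referring block's own offset


@dataclass(frozen=True)
class RelPaddr:
    """Hypothetical relative address: a base kind plus an offset."""
    base: Base
    offset: int


def resolve(addr: RelPaddr, first_block_start: int, block_start: int) -> int:
    """Turn a relative address into an absolute offset.

    first_block_start: absolute offset of the record's first block.
    block_start: absolute offset of the block that embeds the address.
    """
    if addr.base is Base.BLOCK:
        return block_start + addr.offset
    return first_block_start + addr.offset
```

Keeping the base kind in the type means a block_relative address cannot be silently resolved against the record start, which is the kind of mix-up the distinction is meant to make visible when debugging.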
* refs/pull/26004/head:
mds: forward mds metrics to ceph manager w/ querying interfaces
mds: track per session client metrics
mds: record metrics from all MDSs in MDS rank 0
mds: non-rank based interface for sending message to an mds
mds: inter-mds ping-pong message and type
mgr: introduce query/report types for ceph metadata server
mds: new intra-mds message type for forwarding aggregated metrics
client: new message type for providing client side metrics
Reviewed-by: Xiubo Li <xiubli@redhat.com> Reviewed-by: Patrick Donnelly <pdonnell@redhat.com>
* refs/pull/34838/head:
vstart_runner: don't use namespaces by default
qa/cephfs: run nsenter commands with superuser privileges
qa/cephfs: look for mountpoint in cmdline file
Reviewed-by: Xiubo Li <xiubli@redhat.com> Reviewed-by: Patrick Donnelly <pdonnell@redhat.com>
This follows b162541ac21e965a304ee6ffe604c43f22fa96c4.
The balancer was turned on by default in d4fbaf7, as a result of which we might see
PG_AVAILABILITY health warnings when pg-upmap-items are applied.
Kefu Chai [Sat, 30 May 2020 04:51:14 +0000 (12:51 +0800)]
rgw/reshard: use defined variable
use the already defined reference for more concise code. this silences
warnings like:
```
../src/rgw/rgw_reshard.cc:530:15: warning: unused variable ‘bucket’ [-Wunused-variable]
530 | rgw_bucket& bucket = bucket_info.bucket;
| ^~~~~~
```
also move `ret` close to where it is used for the first time.
* refs/pull/34782/head:
qa/tasks/cephfs/mount.py: remove netns name parsing in mountpoint setter
qa/tasks/vstart_runner.py: add kwargs parameter to ignore the ones it does not understand
Reviewed-by: Rishabh Dave <ridave@redhat.com> Reviewed-by: Jeff Layton <jlayton@redhat.com> Reviewed-by: Patrick Donnelly <pdonnell@redhat.com>
The goal is to never implicitly ignore errors that the function can
return, particularly the failure on pidfile locking due to the file
being held by another instance. This problem happened recently in
crimson-osd.
Lenz Grimmer [Tue, 2 Jun 2020 13:11:20 +0000 (15:11 +0200)]
Merge pull request #35249 from rhcs-dashboard/wip-45705-master
mgr/dashboard: add API team to CODEOWNERS
Reviewed-by: Alfonso Martínez <almartin@redhat.com> Reviewed-by: Laura Paduano <lpaduano@suse.com> Reviewed-by: Stephan Müller <smueller@suse.com> Reviewed-by: Tatjana Dehler <tdehler@suse.com> Reviewed-by: Volker Theile <vtheile@suse.com>
Rishabh Dave [Wed, 29 Apr 2020 18:10:16 +0000 (23:40 +0530)]
qa/cephfs: run nsenter commands with superuser privileges
And add a method that sets self.fuse_daemon.subproc.pid to the PID of
the process that doesn't have sudo in its arguments. For example, when
"sudo ceph-fuse /mnt/cephfs" is run on the shell, it launches a process
with arguments "ceph-fuse /mnt/cephfs". The added method gets the PID of
the latter/child process and sets that as the fuse daemon's PID. Not doing
so kills the former/parent process but not the child process.
Also, while we are here, clean up this method a bit.
Fixes: https://tracker.ceph.com/issues/45339 Signed-off-by: Rishabh Dave <ridave@redhat.com>
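The PID selection described above could look roughly like this. It is a sketch under assumptions, not the method added by the commit: `read_cmdline` and `find_daemon_pid` are hypothetical names, and the candidate processes are passed in as a plain mapping from PID to argument list.

```python
import os
from typing import Dict, List, Optional


def read_cmdline(pid: int) -> List[str]:
    """Read a process's arguments from /proc/<pid>/cmdline (Linux only)."""
    with open(f"/proc/{pid}/cmdline", "rb") as f:
        raw = f.read()
    # Arguments are NUL-separated; drop the empty trailing entry.
    return [part.decode() for part in raw.split(b"\0") if part]


def find_daemon_pid(procs: Dict[int, List[str]], mountpoint: str) -> Optional[int]:
    """Return the PID of the ceph-fuse process that was not started via sudo."""
    for pid, argv in sorted(procs.items()):
        names = [os.path.basename(arg) for arg in argv]
        if "sudo" in names:
            continue  # the sudo wrapper (parent), not the daemon itself
        if "ceph-fuse" in names and mountpoint in argv:
            return pid
    return None
```

In practice the mapping could be built with `{pid: read_cmdline(pid) for pid in candidate_pids}`, where `candidate_pids` is whatever list of PIDs the caller has collected.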
Stephan Müller [Mon, 4 May 2020 12:45:52 +0000 (14:45 +0200)]
mgr/dashboard: Use right size in pool form
Currently the max size is determined by the number of OSDs, which is
compared with the maximum of the current crush rule.
The problem is that this is wrong for every crush rule that doesn't have
OSDs as its failure domain and that doesn't have the root of the cluster
set as the root of the crush rule.
Now the crush map will be used to determine how many failure domains are
really available in the cluster and how many can really be used in the
end. This number now defines the maximum size you can enter.
The crush detail view will now show the new attribute usable_size and hide
the redundant information steps, ruleset, type and rule_name.
Fixes: https://tracker.ceph.com/issues/44620 Signed-off-by: Stephan Müller <smueller@suse.com>
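To illustrate the idea of counting failure domains under a rule's root, here is a small sketch. It is written in Python for brevity and is not the dashboard's implementation; the node layout (`type`, `children`) and the names `sub_nodes` and `usable_size` are assumptions for the example.

```python
from typing import Dict, List

# Hypothetical crush map: node id -> node, children referenced by id.
EXAMPLE_MAP: Dict[int, dict] = {
    -1: {"name": "default", "type": "root", "children": [-2, -3]},
    -2: {"name": "host-a", "type": "host", "children": [0, 1]},
    -3: {"name": "host-b", "type": "host", "children": [2]},
    -4: {"name": "ssd", "type": "root", "children": [-5]},
    -5: {"name": "host-c", "type": "host", "children": [3]},
    0: {"name": "osd.0", "type": "osd"},
    1: {"name": "osd.1", "type": "osd"},
    2: {"name": "osd.2", "type": "osd"},
    3: {"name": "osd.3", "type": "osd"},
}


def sub_nodes(nodes: Dict[int, dict], root_id: int) -> List[dict]:
    """Return the root node and all of its descendants."""
    found = []
    stack = [root_id]
    while stack:
        node = nodes[stack.pop()]
        found.append(node)
        stack.extend(node.get("children", []))
    return found


def usable_size(nodes: Dict[int, dict], root_id: int, failure_domain: str) -> int:
    """Count the failure domains of the given type below the rule's root."""
    return sum(1 for n in sub_nodes(nodes, root_id) if n["type"] == failure_domain)
```

With this example map, a rule rooted at "default" with "host" as failure domain can use at most two replicas, even though the cluster has four OSDs in total.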
Stephan Müller [Wed, 6 May 2020 08:13:28 +0000 (10:13 +0200)]
mgr/dashboard: Crush selection can handle any crush map
The crush selection class now provides a few static methods that can
search a crush map for the sub nodes of a node and provide a list of
failure domains generated out of a list of nodes.
Fixes: https://tracker.ceph.com/issues/44620 Signed-off-by: Stephan Müller <smueller@suse.com>
Kefu Chai [Sun, 31 May 2020 00:47:34 +0000 (08:47 +0800)]
qa/suites/rgw/tempest: update unsupported tests of tempest
after rerunning tempest with the latest radosgw, remove the supported
tests from the blacklist, and add the ones which are not supported
yet. now we can pass 123 tests in total.
also enable discoverability for better testing coverage, since it's
supported now.
Kefu Chai [Sun, 31 May 2020 00:38:00 +0000 (08:38 +0800)]
qa/tasks/keystone: use "keystone-manage bootstrap"
* qa/tasks/keystone.py:
instead of prefilling keystone manually, use "keystone-manage bootstrap".
it helps to set up the admin user, a "Default" domain with
"default" id, and wire them up with the expected role and an "admin" project,
etc. as the id of the admin domain is known to be "default", we can just use it
in our tests without querying openstack for the id of the "Default"
domain. this is very handy.
* qa/suites/rgw/tempest/tasks/rgw_tempest.yaml:
use "Default" for the domain name, as "Default" is the name of the domain
created by bootstrap, while "default" is its id.
* qa/suites/rgw/crypt/2-kms/barbican.yaml:
remove settings to bootstrap keystone
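For reference, a bootstrap invocation could be assembled along these lines. This is only a sketch and not the code in qa/tasks/keystone.py: the helper name `bootstrap_args`, the password, and the URLs are placeholders, and the `--bootstrap-*` options should be checked against the keystone-manage version in use.

```python
from typing import List


def bootstrap_args(password: str, admin_url: str, public_url: str,
                   region: str = "RegionOne") -> List[str]:
    """Build the argument list for a "keystone-manage bootstrap" run."""
    return [
        "keystone-manage", "bootstrap",
        "--bootstrap-password", password,
        "--bootstrap-region-id", region,
        "--bootstrap-admin-url", admin_url,
        "--bootstrap-public-url", public_url,
    ]


if __name__ == "__main__":
    # Placeholder values for illustration only.
    print(" ".join(bootstrap_args("s3cr3t",
                                  "http://localhost:35357/v3",
                                  "http://localhost:5000/v3")))
```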
Kefu Chai [Thu, 28 May 2020 16:51:39 +0000 (00:51 +0800)]
qa/suites/rgw/tempest: use the latest tempest supporting py3.5
in case we need to use ubuntu xenial for testing: xenial only has python
3.5 packaged, and tempest 23.0 was the last version to support
python3.5 and python2.7.
also do not replace link in tox.ini, as it is reachable.
to address the issues of
- pallets/markupsafe#116
- pypa/setuptools#2017
MarkupSafe is installed by
https://opendev.org/openstack/requirements/raw/branch/stable/pike/upper-constraints.txt
Kefu Chai [Mon, 25 May 2020 07:52:04 +0000 (15:52 +0800)]
qa/suites/rgw/tempest: bump up keystone to 17.0.0
* also generate a sample conf file following the document at
https://github.com/openstack/keystone/tree/17.0.0.0rc2/etc
* use "projects" instead of "tenants" to match the terminology used by
openstack identity API 3.0.
* test API 3.0 instead of API 2.0, by changing
`rgw_keystone_api_version` from "2" to "3"
* explicitly specify a domain "default" for project to be created,
otherwise a POST request will fail with:
```
{"error":{"code":400,"message":"You have tried to create a resource using the admin token. As this token is not within a domain you must explicitly include a domain for this resource to belong to.","title":"Bad Request"}}
```
* create "default" domain, and use it, otherwise a GET request fails
like:
```
2020-05-28T11:17:28.751 INFO:teuthology.orchestra.run.smithi092.stderr:http://smithi092.front.sepia.ceph.com:35357 "GET /v3/domains/default HTTP/1.1" 404 87
2020-05-28T11:17:28.752 INFO:teuthology.orchestra.run.smithi092.stderr:RESP: [404] Content-Length: 87 Content-Type: application/json Date: Thu, 28 May 2020 11:17:28 GMT Server: WSGIServer/0.2 CPython/3.6.9 Vary: X-Auth-Token x-openstack-request-id: req-bc33796f-2bc3-411c-a7fb-1208918e0dbd
2020-05-28T11:17:28.752 INFO:teuthology.orchestra.run.smithi092.stderr:RESP BODY: {"error":{"code":404,"message":"Could not find domain: default.","title":"Not Found"}}
```
* add user to "default" domain when creating it.
* use "type" as the positional argument, per
https://docs.openstack.org/keystone/pike/admin/cli-keystone-manage-services.html
otherwise we will have failures like:
```
2020-05-28T13:38:24.867 INFO:teuthology.orchestra.run.smithi198.stderr:openstack service create: error: unrecognized arguments: --type keystone
```
* update `create_endpoint()` to use the V3 API,
see
https://docs.openstack.org/python-openstackclient/pike/cli/command-objects/endpoint.html
Kefu Chai [Thu, 28 May 2020 15:14:35 +0000 (23:14 +0800)]
qa/tasks/keystone.py: support multiple positional args
it's required when creating endpoint, see
https://docs.openstack.org/python-openstackclient/pike/cli/command-objects/endpoint.html,
where we need to pass <service>, <interface>, and <url>
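A command builder that accepts several positional arguments might look like the sketch below. It is illustrative only and not the keystone.py code: `openstack_cmd` is a hypothetical helper, and the service type, region, and URL are placeholder values.

```python
from typing import List


def openstack_cmd(command: str, *positional: str, **options: str) -> List[str]:
    """Build an openstack CLI argument list.

    Keyword arguments become "--option value" pairs; positional arguments
    are appended afterwards, in the order given.
    """
    args = ["openstack"] + command.split()
    for key, value in sorted(options.items()):
        args += ["--" + key.replace("_", "-"), value]
    args += list(positional)
    return args


if __name__ == "__main__":
    # "service create" takes the type as its single positional argument.
    print(openstack_cmd("service create", "identity", name="keystone"))
    # "endpoint create" takes <service>, <interface> and <url>.
    print(openstack_cmd("endpoint create", "keystone", "public",
                        "http://localhost:5000/v3", region="RegionOne"))
```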