Kefu Chai [Fri, 31 Jan 2020 05:52:49 +0000 (13:52 +0800)]
qa/tasks: drop test_cephadm_orchestrator.py
this test will end with a failure like
```
2020-01-30T18:15:15.870 INFO:tasks.ceph.mgr.x.smithi042.stderr:Warning: Permanently added 'smithi042.front.sepia.ceph.com,172.21.15.42' (ECDSA) to the list of known hosts.
2020-01-30T18:15:15.925 INFO:tasks.ceph.mgr.x.smithi042.stderr:Permission denied, please try again.
2020-01-30T18:15:15.932 INFO:tasks.ceph.mgr.x.smithi042.stderr:Permission denied, please try again.
2020-01-30T18:15:15.939 INFO:tasks.ceph.mgr.x.smithi042.stderr:root@smithi042.front.sepia.ceph.com: Permission denied (publickey,gssapi-keyex,gssapi-with-mic,password).
```
because mgr is not able to establish an ssh connection to that host as "root".
please note, the teuthology worker uses the "ubuntu" account on the
test node, and by default, "root" does not have its pubkey. in contrast,
`qa/tasks/cephadm.py` does push the pubkey to all the managed hosts before
testing cephadm.
since `qa/tasks/cephadm.py` is a better test for cephadm, let's just
drop this one.
Sage Weil [Wed, 5 Feb 2020 22:07:43 +0000 (16:07 -0600)]
Merge PR #33058 into master
* refs/pull/33058/head:
mgr/cephadm: enforce that a host is a valid DNS name
mgr/cephadm: verify host's hostname matches our host name
cephadm: check-host: add optional --expect-hostname
Reviewed-by: Michael Fritch <mfritch@suse.com>
Reviewed-by: Ernesto Puerta <epuertat@redhat.com>
When using vg/lv, this function throws an error like the following:
```
stderr: unable to read label for test_group/data-lv2: (2) No such file or directory
stderr: 2020-02-04T21:03:32.153+0000 7fe091af4200 -1 bluestore(test_group/data-lv2) _read_bdev_label failed to open test_group/data-lv2: (2) No such file or directory
```
When passing a vg/lv path for generating a single report, it fails
because the filter used in the `lvs` command isn't right. It uses the lv
name instead of the vg name because `os.path.basename(device)` is used
where `os.path.dirname(device)` should be.
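The difference between the two calls can be seen on the path from the error above. This is only an illustration using the Python standard library, not the ceph-volume code itself:

```python
import os.path

# vg/lv path taken from the error message above
device = "test_group/data-lv2"

lv_name = os.path.basename(device)  # last path component: the logical volume
vg_name = os.path.dirname(device)   # everything before it: the volume group

print(lv_name)  # data-lv2
print(vg_name)  # test_group
```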
Kiefer Chang [Tue, 4 Feb 2020 06:27:17 +0000 (14:27 +0800)]
qa/tasks/mgr/test_orchestrator_cli: support multiple DriveGroups
create_osds interface in Orchestrator supports multiple named DriveGroups
since https://github.com/ceph/ceph/pull/32972. Adapt the test
to that change.
Sage Weil [Tue, 4 Feb 2020 03:28:40 +0000 (21:28 -0600)]
Merge PR #33020 into master
* refs/pull/33020/head:
osdc/Objecter: inline pool full check
osdc/Objecter: remove duplicated pause check code
osdc/Objecter: only pause if respects_full()
osdc/Objecter: move respects_full() to op_target_t
Sage Weil [Tue, 4 Feb 2020 03:28:19 +0000 (21:28 -0600)]
Merge PR #32831 into master
* refs/pull/32831/head:
common, include: drop the copy{_in} from bufferlist entirely.
os/bluestore: switch copy_in() users to bufferlist::iterator.
osdc: switch users of bufferlist::copy{_in} to iterators.
osd: switch users of bufferlist::copy{_in} to iterators.
rgw: switch copy{_in} users to bufferlist::iterator.
ec: switch users of bufferlist::copy{_in} to iterators.
cls/queue: switch users of bufferlist::copy{_in} to iterators.
client: switch users of bufferlist::copy{_in} to iterators.
*: switch trivial users of bufferlist::copy{_in} to iterators.
test/bl: switch copy{_in} users to bufferlist::iterator.
common, include: kill the bl::last_p member.
common: encode for std::list<T> doesn't use bl::copy_in() anymore.
Matthew Oliver [Tue, 4 Feb 2020 02:29:48 +0000 (13:29 +1100)]
ceph_argparse: increment matchcnt on kwargs
Currently when you pass a param in on the ceph cli as a kwarg
(--<param_name>) the matchcnt isn't incremented in the validate method
which is used to choose the right command signature.
Yaarit Hatuka [Mon, 3 Feb 2020 19:19:39 +0000 (14:19 -0500)]
mgr/telemetry: check get_metadata return val
get_metadata() returns 'None' when requesting a missing service, hence
trying to access its content fails. Added a check for osd and mgr
get_metadata() calls.
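A minimal sketch of the kind of check described, assuming a stand-in `get_metadata()` that returns `None` for an unknown service. The `KNOWN` table, the `objectstore_of()` helper and the metadata key are hypothetical and only serve the illustration; this is not the telemetry module's code:

```python
from typing import Dict, Optional

# Hypothetical stand-in data: only osd.0 exists.
KNOWN = {("osd", "0"): {"osd_objectstore": "bluestore"}}

def get_metadata(svc_type: str, svc_id: str) -> Optional[Dict[str, str]]:
    # Stand-in for the mgr call: None when the service is missing.
    return KNOWN.get((svc_type, svc_id))

def objectstore_of(osd_id: str) -> Optional[str]:
    metadata = get_metadata("osd", osd_id)
    if metadata is None:
        # Missing service: skip it instead of indexing into None.
        return None
    return metadata.get("osd_objectstore")
```

With this guard, `objectstore_of("99")` returns `None` rather than raising a `TypeError`.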
cephadm: fix error handling in `command_check_host()`
`find_program()` raises `ValueError` when the executable hasn't been
found. It means we need to catch `ValueError` exception in
`command_check_host()` and raise `Error` instead of `RuntimeError` since
only `Error` is caught at the end.
Typical failure:
```
INFO:cephadm:/usr/bin/ceph:stderr Error ENOENT: New host mon1 failed check: ['INFO:cephadm:podman|docker (/bin/podman) is present', 'INFO:cephadm:systemctl is present', 'Traceback (most recent call last):', ' File "<stdin>", line 2820, in <module>', ' File "<stdin>", line 2434, in command_check_host', ' File "<stdin>", line 796, in find_program', 'ValueError: lvcreate not found']
```
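The error-handling pattern described above can be sketched as follows. The `Error` class, the `finder` parameter and the function bodies are simplified assumptions for illustration, not the actual cephadm source:

```python
import shutil

class Error(Exception):
    """Stand-in for the exception type that the top level catches."""

def find_program(name: str) -> str:
    # Raises ValueError when the executable is not on PATH.
    path = shutil.which(name)
    if path is None:
        raise ValueError(f"{name} not found")
    return path

def command_check_host(required=("lvcreate",), finder=find_program) -> None:
    # `finder` is injectable only to make the sketch easy to exercise.
    for exe in required:
        try:
            finder(exe)
        except ValueError as e:
            # Convert to Error so the top-level handler reports it cleanly
            # instead of printing a traceback.
            raise Error(f"unable to locate {exe}: {e}") from e
```

The point of the conversion is that a missing executable is reported as a one-line error by whatever catches `Error`, rather than escaping as an unhandled exception.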
This allows for evaluation of more complex use cases where IgnorePublicACLs and
the like are set, which need to be evaluated for GET/HEAD requests as well.
This API returns whether the Bucket Policies/ACLs are public. There are a couple
of caveats:
- AWS currently returns a PolicyNotFound error in case a bucket policy doesn't
exist, though a nonexistent bucket policy would mean the default ACLs apply
where the bucket is private, so returning an error here seems wrong
- the API spec mentions TRUE and FALSE as the response IsPublic element value,
however in practice both boto/aws clients and AWS S3 return/expect a lowercase
response.
Signed-off-by: Abhishek Lekshmanan <abhishek@suse.com>
Conflicts:
src/rgw/rgw_rest_s3.h
merge conflict after zipper rework, dropped a spurious newline in rgw_rest_s3.h
after get_obj_op decl.
src/rgw/rgw_common.h
src/rgw/rgw_rest_s3.cc
src/rgw/rgw_rest_s3.h:
merge conflict after bucket replication merge, trivial conflicts
When playing with cephadm, I have repeatedly hit the maximum number of
attempts in `is_available()`.
Increasing `retry_max` helps to avoid failures like the following:
```
INFO:cephadm:mgr not available, waiting (1/5)...
INFO:cephadm:mgr not available, waiting (2/5)...
INFO:cephadm:mgr not available, waiting (3/5)...
INFO:cephadm:mgr not available, waiting (4/5)...
INFO:cephadm:mgr not available, waiting (5/5)...
ERROR: mgr not available after 5 tries
```
Sage Weil [Fri, 31 Jan 2020 14:35:26 +0000 (08:35 -0600)]
mgr/cephadm: prefix daemon ids with hostname
This is friendlier to a human operator since they can immediately see
where an instance is located, as with the legacy scheme, while still
keeping the unique random suffix. Use a . to separate so that we can
set per-host options.
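A rough sketch of the naming scheme, assuming a lowercase random suffix of six characters. The helper name, suffix length and alphabet are hypothetical; the commit only states that the hostname comes first, followed by a `.` and the random suffix:

```python
import random
import string

def make_daemon_id(host: str, suffix_len: int = 6) -> str:
    # Random suffix keeps ids unique; the hostname prefix shows placement.
    suffix = "".join(
        random.choice(string.ascii_lowercase) for _ in range(suffix_len)
    )
    return f"{host}.{suffix}"
```

A call such as `make_daemon_id("smithi042")` yields an id of the form `smithi042.<suffix>`.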
xie xingguo [Mon, 3 Feb 2020 13:04:05 +0000 (21:04 +0800)]
osd/OSD: prevent down osds from immediately rejoining the cluster
In 114c65fc I posted a work-around to fix a heartbeat brain-split case
but it really looks to me now like I am missing some other cases where
an immediate attempt to rejoin is bad, like when the network actually
isn't working properly rather than being predictably manipulated by an
admin.
This patch instead slows down the unconditional rejoin attempt, and
in particular makes sure that we don't try to immediately rejoin the cluster
when an osd has just been marked down by the mon.
xie xingguo [Mon, 3 Feb 2020 12:09:37 +0000 (20:09 +0800)]
osd/OSD: trim osd_markdown_log in tick() thread
so we don't have to do it in multiple places. Note that
we can't do it in the tick_without_osd_lock thread instead
because we cannot access it safely without the protection
of osd_lock.