cephadm: fix error handling in `command_check_host()`
`find_program()` raises `ValueError` when the executable can't be
found. This means we need to catch the `ValueError` exception in
`command_check_host()` and raise `Error` instead of `RuntimeError`, since
only `Error` is caught at the end.
Typical failure:
```
INFO:cephadm:/usr/bin/ceph:stderr Error ENOENT: New host mon1 failed check: ['INFO:cephadm:podman|docker (/bin/podman) is present', 'INFO:cephadm:systemctl is present', 'Traceback (most recent call last):', ' File "<stdin>", line 2820, in <module>', ' File "<stdin>", line 2434, in command_check_host', ' File "<stdin>", line 796, in find_program', 'ValueError: lvcreate not found']
```
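The pattern described above can be sketched roughly as follows. This is not the actual cephadm code: `find_program()` is approximated here with `shutil.which`, `Error` is modelled as a plain exception subclass, and `check_program()` is a hypothetical helper standing in for the relevant part of `command_check_host()`.

```python
import shutil


class Error(Exception):
    """Stand-in for the error type caught at the top level (assumed shape)."""


def find_program(filename):
    # Assumed behaviour, per the commit message: ValueError when missing.
    path = shutil.which(filename)
    if path is None:
        raise ValueError('%s not found' % filename)
    return path


def check_program(name):
    # Hypothetical helper: translate ValueError into Error so that a
    # handler which only catches Error reports it instead of a traceback.
    try:
        return find_program(name)
    except ValueError as e:
        raise Error('unable to locate %s: %s' % (name, e))
```

With this shape, a missing `lvcreate` would surface as a single `Error` message rather than the traceback shown above.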
While experimenting with cephadm, I have repeatedly hit the maximum
number of attempts in `is_available()`.
Increasing `retry_max` helps avoid failures like the following:
```
INFO:cephadm:mgr not available, waiting (1/5)...
INFO:cephadm:mgr not available, waiting (2/5)...
INFO:cephadm:mgr not available, waiting (3/5)...
INFO:cephadm:mgr not available, waiting (4/5)...
INFO:cephadm:mgr not available, waiting (5/5)...
ERROR: mgr not available after 5 tries
```
* refs/pull/31633/head:
cephfs-shell: Instead of assert use stat for tests in rmdir
cephfs-shell: Add function for common rmdir test code
cephfs-shell: Add rmdir test for non empty directory
cephfs-shell: Add rmdir -p test for non empty directory
cephfs-shell: Add rmdir -p test for non existing dir
cephfs-shell: Add rmdir -p test to delete all dirs in given path
cephfs-shell: Add rmdir -p test for root directory with empty directories
cephfs-shell: Add rmdir test for valid file
cephfs-shell: Add rmdir test for invalid directory
cephfs-shell: Add rmdir test for valid directory
cephfs-shell: Fix rmdir '-p' issues
Reviewed-by: Rishabh Dave <ridave@redhat.com> Reviewed-by: Patrick Donnelly <pdonnell@redhat.com>
Sage Weil [Fri, 31 Jan 2020 23:31:31 +0000 (17:31 -0600)]
Merge PR #32806 into master
* refs/pull/32806/head:
common/bl: fix accessibility of bptr's _off and _len fields.
common/bl: drop get_raw() from the public buffer::ptr interface.
common: drop sharing of buffer::raw outside bufferlist.
Venky Shankar [Wed, 4 Dec 2019 04:49:12 +0000 (23:49 -0500)]
mgr/volumes: purge thread uses new async interface
This also makes `_cancel_jobs()` thread safe, which was not the
case earlier (with `_cancel_purge_job()`) -- this also makes the
code simpler by sharing the lock between two condition variables.
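As an illustration of sharing one lock between two condition variables, here is a minimal sketch. The `JobQueue` class and all of its method and attribute names are hypothetical and are not taken from mgr/volumes; only the locking pattern is the point.

```python
import threading


class JobQueue:
    """Hypothetical queue; names are illustrative only."""

    def __init__(self):
        self.lock = threading.Lock()
        # Two conditions over one lock: a single `with self.lock` protects
        # all state, whichever condition is being waited on or notified.
        self.work_ready = threading.Condition(self.lock)
        self.cancel_done = threading.Condition(self.lock)
        self.pending = []
        self.running = set()

    def submit(self, job):
        with self.lock:
            self.pending.append(job)
            self.work_ready.notify()

    def take(self):
        with self.lock:
            while not self.pending:
                self.work_ready.wait()
            job = self.pending.pop(0)
            self.running.add(job)
            return job

    def finish(self, job):
        with self.lock:
            self.running.discard(job)
            self.cancel_done.notify_all()

    def cancel(self, job):
        with self.lock:
            if job in self.pending:
                self.pending.remove(job)
                return
            # Already picked up by a worker: wait until it finishes.
            while job in self.running:
                self.cancel_done.wait()
```

Because both conditions wrap the same lock, `cancel()` sees a consistent view of `pending` and `running` without a second lock or any lock-ordering rules.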
The seastar submodule's .gitmodules links to `../dpdk` which is no longer present after removing dpdk from ceph.git's .gitmodules.
```
<dwfreed> the ceph/seastar repo uses awful URLs for the submodules
<dwfreed> and those awful URLs are the real reason it's failing
<dwfreed> dgalloway: ^^^
<dwfreed> seastar's .gitmodules references repos in the parent directory, so that when it's checked out as a submodule of ceph, you don't download the repos twice (and git will probably also use references instead of duplicating the local .git); however, ceph doesn't have a submodule for dpdk anymore
<dwfreed> so seastar's referencing a dpdk repo that doesn't exist
<dgalloway> i think i follow. so you're suggesting revert https://github.com/ceph/ceph/commit/cb8087dfac31b8490fefdfca28d389b7b9901ef8 ?
<dwfreed> yep
<dwfreed> that'd be one way to fix it
...
<joshd> dgalloway: I'd suggest revert for now, and let the crimson folks figure out the longer term fix when they're back
```
Signed-off-by: David Galloway <dgallowa@redhat.com>
Sage Weil [Thu, 30 Jan 2020 16:22:49 +0000 (10:22 -0600)]
qa/tasks/ceph: only re-request scrub on unscrubbed pgs
If we haven't scrubbed everything, we occasionally re-request scrub in case
the request was missed by the OSD (this can happen). But we were
re-requesting scrub on ALL pgs, and if they are done in a
semi-deterministic order and are slow, then we may never get to the final
ones.
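The idea can be sketched as a simple filter. The function name, the dict layout, and the assumption that timestamps have already been parsed into comparable values are all illustrative and not taken from qa/tasks/ceph.

```python
def pgs_needing_scrub(pg_stats, check_time):
    """Return ids of PGs whose last scrub predates check_time.

    pg_stats is assumed to be a list of dicts with 'pgid' and
    'last_scrub_stamp' keys (hypothetical shape for this sketch).
    """
    return [pg['pgid'] for pg in pg_stats
            if pg['last_scrub_stamp'] < check_time]
```

Re-requesting scrub only for the returned ids leaves already-scrubbed PGs alone, so slow PGs at the end of the order are not starved by repeated requests for the earlier ones.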
Sage Weil [Thu, 30 Jan 2020 15:28:38 +0000 (09:28 -0600)]
Merge PR #32878 into master
* refs/pull/32878/head:
cephadm: share code between 'pull' and 'inspect-image'
mgr/cephadm: upgrade: pull image after upgrade start, and for each host
cephadm: add inspect-image command
Patrick Donnelly [Thu, 30 Jan 2020 15:06:21 +0000 (07:06 -0800)]
Merge PR #32397 into master
* refs/pull/32397/head:
mds: Move StrayManager initializations to its header
mds: Remove extra spaces in StrayManager header.
mds: Reorganize structure members in StrayManager header
Reviewed-by: Patrick Donnelly <pdonnell@redhat.com>
Sage Weil [Thu, 30 Jan 2020 14:32:50 +0000 (08:32 -0600)]
qa/tasks/ceph_manager: make fix_pgp_num behave when no pool is found
Fixes:
2020-01-30T04:41:24.697 INFO:tasks.thrashosds.thrasher:fixing pg num pool None
2020-01-30T04:41:24.698 INFO:tasks.thrashosds.thrasher:Traceback (most recent call last):
File "/home/teuthworker/src/git.ceph.com_ceph-c_wip-sage-testing-2020-01-29-1034/qa/tasks/ceph_manager.py", line 1070, in wrapper
return func(self)
File "/home/teuthworker/src/git.ceph.com_ceph-c_wip-sage-testing-2020-01-29-1034/qa/tasks/ceph_manager.py", line 1200, in _do_thrash
self.choose_action()()
File "/home/teuthworker/src/git.ceph.com_ceph-c_wip-sage-testing-2020-01-29-1034/qa/tasks/ceph_manager.py", line 768, in fix_pgp_num
if self.ceph_manager.set_pool_pgpnum(pool, force):
File "/home/teuthworker/src/git.ceph.com_ceph-c_wip-sage-testing-2020-01-29-1034/qa/tasks/ceph_manager.py", line 2088, in set_pool_pgpnum
assert isinstance(pool_name, six.string_types)
AssertionError
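A guard of roughly this shape avoids the assertion. The signature below is hypothetical (the setter is passed in as a callable to keep the sketch self-contained) and does not mirror the real `fix_pgp_num` in ceph_manager.py.

```python
def fix_pgp_num(pool, set_pool_pgpnum):
    # `pool` may be None when no pool could be chosen; bail out here
    # instead of tripping the string-type assertion further down.
    if pool is None:
        return False
    return bool(set_pool_pgpnum(pool))
```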
Lenz Grimmer [Thu, 30 Jan 2020 13:43:46 +0000 (13:43 +0000)]
mgr/dashboard: Change project name to "Ceph Dashboard" (#32959)
mgr/dashboard: Change project name to "Ceph Dashboard"
Reviewed-by: Laura Paduano <lpaduano@suse.com> Reviewed-by: Patrick Seidensal <pnawracay@suse.com> Reviewed-by: Stephan Müller <smueller@suse.com> Reviewed-by: Tatjana Dehler <tdehler@suse.com> Reviewed-by: Volker Theile <vtheile@suse.com>
Sage Weil [Thu, 30 Jan 2020 13:01:47 +0000 (07:01 -0600)]
Merge PR #32972 into master
* refs/pull/32972/head:
python-common/ceph/deployment/translate: use 'prepare' instead of 'batch' for trivial case
qa/tasks/cephadm: pass short dev name to osd prepare
mgr/cephadm: fix detection of just-created OSDs
mgr/cephadm: properly indent raise conditions
mgr/cephadm: add warning to other orchestrators
mgr/cephadm: separate acceptance criterias for Devices
mgr/cephadm: fix typos
mgr/cephadm: move utils in test/utils.py
mgr/ssh: increase disk size to 20G
drivegroups: add support for drivegroups + tests
mgr/orch_cli: allow multiple drivegroups
drivegroups: translate disk spec to ceph-volume call
Reviewed-by: Jan Fajerski <jfajerski@suse.com> Reviewed-by: Joshua Schmid <jschmid@suse.de>
Greg Farnum [Thu, 30 Jan 2020 12:43:13 +0000 (04:43 -0800)]
mon: elector: return after triggering a new election
When receiving an old propose, we were correctly triggering a new election
but not then returning out of receive_propose(), so we processed the
"should I defer" logic and perhaps sent out a deferral (in the current epoch!).
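The control-flow bug can be modelled with a toy class. The real elector is C++ and far more involved; the `Elector` class below, its `sent` list, and the way `start_election()` bumps the epoch are assumptions made purely for illustration.

```python
class Elector:
    """Toy model of the control flow only, not the real monitor code."""

    def __init__(self, epoch):
        self.epoch = epoch
        self.sent = []  # messages this toy elector would send

    def start_election(self):
        # Assumed for the sketch: a new election bumps the epoch.
        self.epoch += 1
        self.sent.append(('propose', self.epoch))

    def receive_propose(self, sender, msg_epoch):
        if msg_epoch < self.epoch:
            # Stale proposal: restart the election and stop here.
            self.start_election()
            return  # without this, the defer logic below would still run
        self.sent.append(('defer', sender, msg_epoch))
```

With the early `return`, an old proposal results in exactly one outgoing message (the new proposal) and no deferral.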