Xuehan Xu [Wed, 18 Mar 2020 02:32:02 +0000 (10:32 +0800)]
crimson/os/heartbeat: make Heartbeat::send_failures() safe
Currently, Heartbeat::send_failures() invokes monc.send_message() in a
continuation which may run asynchronously, risking a dangling "monc"
reference when the OSD shuts down and the MonClient is destroyed.
Sage Weil [Wed, 11 Mar 2020 22:38:59 +0000 (17:38 -0500)]
Merge PR #33885 into master
* refs/pull/33885/head:
Merge pull request #33848 from mchangir/octopus-tests-remove-suprious-whitespace
Merge PR #33746 into octopus
Merge PR #33830 into octopus
Merge PR #33732 into octopus
Merge PR #33620 into octopus
Merge pull request #33876 from tchaikov/octopus-cephadm-mypy
cephadm: add "assert foo is not None" for mypy check
Merge pull request #33067 from tspmelo/wip-rbd-delete-with-snapshot
cephadm: add grafana adopt
Merge PR #33771 into octopus
Merge PR #33850 into octopus
Merge PR #33853 into octopus
Merge PR #33857 into octopus
Merge PR #32990 into octopus
Merge PR #33713 into octopus
Merge PR #33838 into octopus
qa/tasks/cephadm: no default mon|mgr|crash service specs
qa/suites/rados/cephadm/upgrade: upgrade start point that supports the no-spec option
Merge PR #33832 into octopus
cephadm: bootstrap: wait for mgr to restart after enabling a module
mgr: add 'mgr_status' tell command
Merge pull request #33839 from rhcs-dashboard/44538-fix-rgw-grafana-get-put-latencies
Merge pull request #33743 from votdev/issue_43869_fix_qa_test
cephadm: create initial mon and mgr service specs too
cephadm: no need to pregenerate a crash key for the bootstrap host
mgr/cephadm: do not complain when we don't have enough hosts
mgr/cephadm: remove orphan daemons
mgr/cephadm: report size=0 for fabricated ServiceDescription
mgr/cephadm: safety check to prevent removing all mon|mgr daemons
mgr/cephadm: prevent scaling mon|mgr below count=1
mgr/cephadm: do not remove daemons from remove_service
Merge pull request #33805 from tchaikov/wip-44500
spec: Podman (temporarily) requires apparmor-abstractions on suse
mgr/cephadm: Make sure we don't co-locate the same daemon
monitoring: fix RGW grafana chart 'Average GET/PUT Latencies'
tests: remove spurious whitespace
mgr/cephadm: fix service list filtering
Merge PR #33825 into octopus
Merge PR #33811 into octopus
Revert "Merge pull request #33673 from cbodley/wip-denc-enum"
mgr/cephadm: fix upgrade order
Merge PR #33801 into octopus
Merge PR #33822 into octopus
cephadm: bootstrap: tolerate error return from -h
Merge PR #33809 into octopus
Merge PR #32678 into octopus
cephadm: use `sh` instead of `bash` during enter
ceph.in: only shut down rados on clean exit
common/ceph_timer: Pass reference to waited time on stack
common/ceph_timer: Add test
common/ceph_timer: Use unique_function, allowing noncopyable events
common/ceph_timer: Couple cleanups
common/ceph_timer: Fix namespaces
common/ceph_timer: Add missing includes
common/ceph_timer.h: Don't indent contents of a namespace
mgr/dashboard: Crush rule modal
mgr/dashboard: Preserve rule selection on pool type change
mgr/dashboard: Crush rule is only send during replicated pool creation
mgr/dashboard: Explicit returns in pool form
mgr/dashboard: Removes fork join in pool form
mgr/dashboard: Hide ECP actions during ec pool edit
mgr/dashboard: Pool form erasure/replicated boolean
mgr/dashboard: Change pool info API endpoint
mgr/dashboard: Moves ECP info endpoint to UI-API
mgr/cephadm: add _remove_osds_bg back to main loop
mgr/cephadm/osd: update removal report immediately
qa/tasks/ceph_manager: use StringIO for capturing COT output
qa/standalone/scrub/osd-scrub-repair: force osdmap prop to osds
qa/standalone/scrub/osd-scrub-test: wait longer for update
qa/tasks/ceph_manager: capture stderr for COT
qa/suites/rados/ceph: drop opensuse for now
mon/MonClient: send logs to mon on separate schedule than pings
mgr/dashboard: Fix missing ImageSpec usage
mgr/dashboard: Allow removing RBD with snapshots
mgr/dashboard: Refactor and cleanup tasks.mgr.dashboard.test_user
mgr/dashboard: support multiple DriveGroups when creating OSDs
mon/MonClient: send logs to mon even if we have no keepalive2
cephadm: flag dashboard user to change password
anurag [Wed, 11 Mar 2020 14:17:05 +0000 (19:47 +0530)]
mgr/dashboard: Pool read/write OPS shows too many decimal places
Fixes: https://tracker.ceph.com/issues/39714 Signed-off-by: anurag <anurag@localhost.localdomain>
Sage Weil [Wed, 11 Mar 2020 13:55:51 +0000 (08:55 -0500)]
Merge PR #33830 into octopus
* refs/pull/33830/head:
qa/tasks/cephadm: no default mon|mgr|crash service specs
qa/suites/rados/cephadm/upgrade: upgrade start point that supports the no-spec option
cephadm: create initial mon and mgr service specs too
cephadm: no need to pregenerate a crash key for the bootstrap host
mgr/cephadm: do not complain when we don't have enough hosts
mgr/cephadm: remove orphan daemons
mgr/cephadm: report size=0 for fabricated ServiceDescription
mgr/cephadm: safety check to prevent removing all mon|mgr daemons
mgr/cephadm: prevent scaling mon|mgr below count=1
mgr/cephadm: do not remove daemons from remove_service
Sage Weil [Wed, 11 Mar 2020 12:12:11 +0000 (07:12 -0500)]
Merge PR #33620 into octopus
* refs/pull/33620/head:
mgr/dashboard: Crush rule modal
mgr/dashboard: Preserve rule selection on pool type change
mgr/dashboard: Crush rule is only send during replicated pool creation
mgr/dashboard: Explicit returns in pool form
mgr/dashboard: Removes fork join in pool form
mgr/dashboard: Hide ECP actions during ec pool edit
mgr/dashboard: Pool form erasure/replicated boolean
mgr/dashboard: Change pool info API endpoint
mgr/dashboard: Moves ECP info endpoint to UI-API
Kefu Chai [Wed, 11 Mar 2020 08:08:51 +0000 (16:08 +0800)]
cephadm: add "assert foo is not None" for mypy check
it's legit to pass file objects to fcntl(), but the `Popen.stdout` and
`Popen.stderr` properties are not necessarily file objects -- they could be None.
mypy cannot deduce at type-checking time that they are set, even though we
can ensure it ourselves, as we do pass `subprocess.PIPE` to the constructor.
so mypy just complains at seeing this:
```
cephadm:429: error: Argument 1 to "fcntl" has incompatible type "Optional[IO[Any]]"; expected "Union[int, HasFileno]"
cephadm:430: error: Argument 1 to "fcntl" has incompatible type "Optional[IO[Any]]"; expected "Union[int, HasFileno]"
cephadm:431: error: Argument 1 to "fcntl" has incompatible type "Optional[IO[Any]]"; expected "Union[int, HasFileno]"
cephadm:432: error: Argument 1 to "fcntl" has incompatible type "Optional[IO[Any]]"; expected "Union[int, HasFileno]"
cephadm:455: error: Item "None" of "Optional[IO[Any]]" has no attribute "fileno"
cephadm:465: error: Item "None" of "Optional[IO[Any]]" has no attribute "fileno"
cephadm:475: error: Item "None" of "Optional[IO[Any]]" has no attribute "fileno"
```
to silence this warning, insert `assert process.stdout is not None`
before accessing `process.stdout` to appease the strict optional
checking of mypy.
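A minimal sketch of the pattern described above, not the actual cephadm code. The helper name `make_stdout_nonblocking` is hypothetical; the point is only that the assert narrows `Optional[IO[Any]]` so that the later `fileno()` and `fcntl()` calls type-check.

```python
import fcntl
import os
import subprocess


def make_stdout_nonblocking(process: subprocess.Popen) -> int:
    """Put the child's stdout pipe into non-blocking mode and return its fd.

    Hypothetical helper for illustration only.
    """
    # Popen.stdout is typed Optional[IO[Any]]; it is only set when the
    # process was created with stdout=subprocess.PIPE. The assert tells
    # mypy (and the reader) that we rely on that.
    assert process.stdout is not None
    fd = process.stdout.fileno()
    flags = fcntl.fcntl(fd, fcntl.F_GETFL)
    fcntl.fcntl(fd, fcntl.F_SETFL, flags | os.O_NONBLOCK)
    return fd
```

Without the assert, strict optional checking reports the `fileno` access on a possibly-None value, as in the errors quoted above.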
Sage Weil [Tue, 10 Mar 2020 22:20:48 +0000 (17:20 -0500)]
Merge PR #33771 into octopus
* refs/pull/33771/head:
common/ceph_timer: Pass reference to waited time on stack
common/ceph_timer: Add test
common/ceph_timer: Use unique_function, allowing noncopyable events
common/ceph_timer: Couple cleanups
common/ceph_timer: Fix namespaces
common/ceph_timer: Add missing includes
common/ceph_timer.h: Don't indent contents of a namespace
Nizamudeen [Tue, 10 Mar 2020 16:32:41 +0000 (22:02 +0530)]
mgr/dashboard: NoRebalance flag is added to the Dashboard
This commit adds a norebalance flag to the Cluster-wide Flags in the OSDs, which can be set/unset.
Fixes: https://tracker.ceph.com/issues/44543 Signed-off-by: Nizamudeen <nia@redhat.com>
Sage Weil [Tue, 10 Mar 2020 14:28:57 +0000 (09:28 -0500)]
cephadm: bootstrap: wait for mgr to restart after enabling a module
It was possible to enable a module (mon updates mgrmap) and then
do a mgr command and have that command reach the mgr before it got the
latest mgrmap and restarted.
Fixes: https://tracker.ceph.com/issues/44531 Signed-off-by: Sage Weil <sage@redhat.com>
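One way to picture the fix, as a sketch under assumptions rather than the actual bootstrap code: note the mgrmap epoch the mon publishes after the module is enabled, then poll until the running mgr reports having loaded at least that epoch. Both callables below are hypothetical stand-ins for whatever queries bootstrap really issues, and the timeout values are arbitrary.

```python
import time
from typing import Callable


def wait_for_mgr_restart(
    get_mgrmap_epoch: Callable[[], int],    # hypothetical: epoch the mon currently publishes
    get_mgr_seen_epoch: Callable[[], int],  # hypothetical: epoch the running mgr has loaded
    timeout: float = 30.0,
    interval: float = 1.0,
) -> bool:
    """Return True once the mgr has caught up with the mgrmap, False on timeout."""
    target = get_mgrmap_epoch()
    deadline = time.monotonic() + timeout
    while True:
        if get_mgr_seen_epoch() >= target:
            return True
        if time.monotonic() >= deadline:
            return False
        time.sleep(interval)
```

Only after this returns True would it be safe to send a command that depends on the newly enabled module.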
Sage Weil [Mon, 9 Mar 2020 18:39:04 +0000 (13:39 -0500)]
mgr/cephadm: do not complain when we don't have enough hosts
This gets rid of INFO level log events like
2020-03-09T13:37:20.980993-0500 mgr.x [WRN] Failed to apply mds.foo spec ServiceSpec({'placement': PlacementSpec(count:2), 'service_type': 'mds', 'service_id': 'foo'}): List of host candidates is empty
Sage Weil [Mon, 9 Mar 2020 01:38:59 +0000 (20:38 -0500)]
mgr/cephadm: fix upgrade order
Create two variables, CEPH_TYPES and CEPH_UPGRADE_ORDER. In reality they
are both the same, but this way the meaning is clear, and the lists
won't get out of sync (they should always have the same elements).
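A minimal sketch of how two such variables can stay in sync by deriving one from the other. The daemon types and their order below are illustrative assumptions, not taken from this commit message, and `upgrade_sequence` is a hypothetical helper.

```python
# Assumed, illustrative order; the real list lives in the cephadm module.
CEPH_UPGRADE_ORDER = ['mgr', 'mon', 'crash', 'osd', 'mds', 'rgw', 'rbd-mirror']

# Derived from the ordered list, so the two can never diverge.
CEPH_TYPES = set(CEPH_UPGRADE_ORDER)


def upgrade_sequence(daemon_types):
    """Return the given daemon types sorted into upgrade order (hypothetical helper)."""
    unknown = set(daemon_types) - CEPH_TYPES
    if unknown:
        raise ValueError('unknown daemon types: %s' % sorted(unknown))
    return sorted(daemon_types, key=CEPH_UPGRADE_ORDER.index)
```

Membership checks read naturally against CEPH_TYPES, while anything order-sensitive uses CEPH_UPGRADE_ORDER.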
Deepika Upadhyay [Wed, 12 Feb 2020 14:38:29 +0000 (20:08 +0530)]
mon/OSDMonitor: add flag `--yes-i-really-mean-it` for setting pool size 1
Adds option `mon_allow_pool_size_one` which will be disabled by default
to ensure pools are not configured without replicas.
If the user still wants to use pool size 1, they have to set
`mon_allow_pool_size_one` to true and then pass the flag
`--yes-i-really-mean-it` to the CLI command.
Example:
`ceph osd pool set test size 1 --yes-i-really-mean-it`
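The two-step gate described above can be sketched as follows. This is an illustration of the decision logic only, not the monitor's implementation; the function name and the error strings are hypothetical.

```python
from typing import Optional


def check_pool_size_change(size: int,
                           allow_pool_size_one: bool,
                           yes_i_really_mean_it: bool) -> Optional[str]:
    """Return None if the change is allowed, else a reason for refusing it.

    Hypothetical helper; parameter names mirror the option and the flag
    named in the commit message.
    """
    if size != 1:
        # Only size 1 (no replicas) is gated.
        return None
    if not allow_pool_size_one:
        return 'mon_allow_pool_size_one is disabled'
    if not yes_i_really_mean_it:
        return 'pass --yes-i-really-mean-it to confirm'
    return None
```

Both conditions must hold: enabling the option alone is not enough, and neither is passing the flag alone.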
Sage Weil [Mon, 9 Mar 2020 17:26:06 +0000 (12:26 -0500)]
ceph.in: only shut down rados on clean exit
If we exit due to a timeout, then calling rados shutdown can lead to all
sorts of problems, because we may still have another thread that is
trying to call rados_connect and/or do some work, and rados_connect
and rados_shutdown don't (and can't!) really behave well when racing
against each other.
Note that shutdown here isn't that important--the process is about to
exit anyway. It's only useful to exercise the shutdown code path more
often.
Fixes: https://tracker.ceph.com/issues/44526 Signed-off-by: Sage Weil <sage@redhat.com>
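A minimal sketch of the exit path described above, assuming a cluster handle that exposes a `shutdown()` method. `FakeCluster` and `finish` are hypothetical names used only for illustration, and the return codes are arbitrary.

```python
class FakeCluster:
    """Hypothetical stand-in for a cluster handle with a shutdown() method."""

    def __init__(self) -> None:
        self.shutdown_called = False

    def shutdown(self) -> None:
        self.shutdown_called = True


def finish(cluster: FakeCluster, timed_out: bool) -> int:
    """Return the process exit code, shutting down only on a clean exit."""
    if timed_out:
        # Another thread may still be connecting or doing work; racing
        # shutdown against it is unsafe, and the process exits anyway.
        return 1
    cluster.shutdown()
    return 0
```

On a timeout the handle is simply abandoned, which is acceptable because the process is about to exit.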
Adam C. Emerson [Fri, 6 Mar 2020 03:14:47 +0000 (22:14 -0500)]
common/ceph_timer: Pass reference to waited time on stack
std::condition_variable::wait_until takes a const reference to a
time_point. It may access this reference after relinquishing the
mutex, creating a potential use-after-free error if the first event is
shut down.
So, just copy the time onto the stack, so we have a reference that
won't disappear.
https://tracker.ceph.com/issues/44373
Signed-off-by: Adam C. Emerson <aemerson@redhat.com>
Sage Weil [Mon, 9 Mar 2020 13:28:57 +0000 (08:28 -0500)]
Merge PR #33793 into master
* refs/pull/33793/head:
qa/suites/rados/cephadm/upgrade: new start point
qa/tasks/cephadm: put bootstrap config etc directly in /etc/ceph
cephadm: shell: default to config and keyring in /etc/ceph, if present
Sage Weil [Mon, 9 Mar 2020 13:28:37 +0000 (08:28 -0500)]
Merge PR #33808 into master
* refs/pull/33808/head:
mgr/cephadm: apply: fill in default placement if none is provided
mgr/cephadm: make placement truly optional (default to count=1)
mgr/cephadm: allow count == 0
mgr/cephadm: remove magic labels
Stephan Müller [Tue, 3 Mar 2020 14:39:32 +0000 (15:39 +0100)]
mgr/dashboard: Preserve rule selection on pool type change
Now if the pool type is changed from replicated to erasure in the pool
form and you have multiple rules, your selection is preserved and not
overwritten by null, which caused an error message to be shown
(crush rule is required).
Fixes: https://tracker.ceph.com/issues/44371 Signed-off-by: Stephan Müller <smueller@suse.com>
Stephan Müller [Mon, 2 Mar 2020 11:26:48 +0000 (12:26 +0100)]
mgr/dashboard: Crush rule is only send during replicated pool creation
The problem was that the crush rule setting was sent during the creation
of a pool regardless of its type, but the setting is only used if a
replicated pool is created. This hasn't caused any problems yet, but to
prevent them it's now omitted for erasure code pool creations.
Fixes: https://tracker.ceph.com/issues/44371 Signed-off-by: Stephan Müller <smueller@suse.com>
Stephan Müller [Mon, 2 Mar 2020 10:48:17 +0000 (11:48 +0100)]
mgr/dashboard: Hide ECP actions during ec pool edit
Hides erasure profile actions during erasure pool edit, as all
actions are disabled anyway as they can't be used in edit mode.
This commit also makes sure that the used crush rule will be shown
during edit of an erasure pool and no crush rule selection is shown
during creation of an erasure code pool, as in most cases a new crush
rule will be created for the ec pool.
Fixes: https://tracker.ceph.com/issues/44371 Signed-off-by: Stephan Müller <smueller@suse.com>
Stephan Müller [Mon, 2 Mar 2020 10:39:48 +0000 (11:39 +0100)]
mgr/dashboard: Pool form erasure/replicated boolean
Now a boolean is set whenever the pool type is changed. The question of
which type is currently set comes up many times inside the pool form,
so checking a boolean is faster than getting the form control from the
form first and then comparing two strings.
Fixes: https://tracker.ceph.com/issues/44371 Signed-off-by: Stephan Müller <smueller@suse.com>
Stephan Müller [Mon, 2 Mar 2020 10:08:23 +0000 (11:08 +0100)]
mgr/dashboard: Change pool info API endpoint
Moves the "_info" endpoint of pool into an equivalent
UI-API call with the name "info".
Added three more attributes to the info dict, which enables the dashboard
to call only info to get all the needed data; currently three calls are
used to do that.
Removed pool_name parameter as the outcome was not used.
Updated the tests and related angular files accordingly.
Fixes: https://tracker.ceph.com/issues/44371 Signed-off-by: Stephan Müller <smueller@suse.com>