Sage Weil [Wed, 19 Feb 2020 12:33:59 +0000 (06:33 -0600)]
Merge PR #33249 into master
* refs/pull/33249/head:
mgr/cephadm: move cutoff calc inside helper
mgr/orch: clean up service_action, remove_service
mgr/cephadm: move into DaemonCache class
mgr/cephadm: fix tests
mgr/cephadm: implement hacky 'refresh=True' path
mgr/cephadm: raise health alert when scrape fails
mgr/cephadm: persist cached daemon state
mgr/orch: serialize DaemonDescription last_refresh
mgr/cephadm: move _get_daemons() impl into list_daemons
mgr/cephadm: replace remaining _get_daemons() with daemon cache
mgr/cephadm: use daemon map for service|daemon removal
mgr/cephadm: avoid _get_daemons for service|daemon actions
mgr/cephadm: replace daemon_cache with an explicit set of dicts
mgr/cephadm: move DaemonDescription construction into helper
Sage Weil [Mon, 17 Feb 2020 13:50:50 +0000 (07:50 -0600)]
mgr/orch: clean up service_action, remove_service
Standardize on service_name argument that looks like 'mgr', 'mds.fsname',
'mds', or some other daemon name prefix. Avoid the ambiguously-named
service_type+service_name combinations.
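To illustrate the naming convention, here is a minimal sketch of how such a service_name could be split into a daemon type and an optional identifier. The helper `split_service_name` is hypothetical and is not part of the orchestrator API.

```python
from typing import Optional, Tuple


def split_service_name(service_name: str) -> Tuple[str, Optional[str]]:
    """Split 'mds.fsname' into ('mds', 'fsname') and 'mgr' into ('mgr', None).

    Hypothetical helper, for illustration only.
    """
    if not service_name:
        raise ValueError("service_name must not be empty")
    # Everything before the first '.' is the daemon type prefix.
    daemon_type, sep, service_id = service_name.partition(".")
    return daemon_type, (service_id if sep else None)
```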
Sage Weil [Fri, 14 Feb 2020 18:07:13 +0000 (12:07 -0600)]
mgr/cephadm: persist cached daemon state
- load cached state on startup
- persist state after a scrape only
- scrape everything after startup
Note that we modify our in-memory cache when we add or remove a service
and then immediately trigger a scrape, but we do not invalidate the
persisted state, since it's simpler (and presumably a good idea) to
simply re-scrape everything after a mgr restart.
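The policy above can be sketched roughly as follows. This is not the cephadm implementation: the class name, the `host.` key prefix, and the plain dict standing in for the mgr key/value store are all assumptions made for illustration.

```python
import json
from typing import Dict, Iterable, Set

PREFIX = "host."  # hypothetical key prefix in the persistent store


class DaemonCacheSketch:
    def __init__(self, store: Dict[str, str], hosts: Iterable[str]) -> None:
        self.store = store  # stand-in for a persistent key/value store
        self.daemons: Dict[str, Dict[str, dict]] = {}
        # Load cached state on startup.
        for key, raw in store.items():
            if key.startswith(PREFIX):
                self.daemons[key[len(PREFIX):]] = json.loads(raw)
        # Scrape everything after startup: every host begins as stale.
        self.stale: Set[str] = set(hosts)

    def add_daemon(self, host: str, name: str, info: dict) -> None:
        # Update memory and request a scrape; the persisted copy is untouched.
        self.daemons.setdefault(host, {})[name] = info
        self.stale.add(host)

    def remove_daemon(self, host: str, name: str) -> None:
        self.daemons.get(host, {}).pop(name, None)
        self.stale.add(host)

    def update_from_scrape(self, host: str, daemons: Dict[str, dict]) -> None:
        # Persist state after a scrape only.
        self.daemons[host] = dict(daemons)
        self.stale.discard(host)
        self.store[PREFIX + host] = json.dumps(self.daemons[host])
```

Because every host is marked stale at construction time, a persisted entry that missed an add or remove is corrected by the first scrape after a restart.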
Volker Theile [Tue, 4 Feb 2020 14:04:08 +0000 (15:04 +0100)]
mgr/dashboard: RGW port autodetection does not support "Beast" RGW frontend
* Improve regular expressions to support more configuration variations.
* Modify the error message to include the config line being parsed, which should make errors much easier to debug.
* If there are multiple (ssl_)ports/(ssl_)endpoints options, then the first found option will be returned.
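A minimal sketch of this kind of parsing is shown below. The regular expression and the function name `parse_frontend_port` are assumptions for illustration and are not the dashboard's actual code; only simple `port=`, `ssl_port=`, `endpoint=`, and `ssl_endpoint=` forms are handled, plus the civetweb-style `s` suffix for SSL ports.

```python
import re
from typing import Tuple

# Matches e.g. "port=8000", "ssl_port=443", "endpoint=0.0.0.0:8000",
# "ssl_endpoint=[::]:443" and the civetweb form "port=443s".
_PORT_RE = re.compile(r"\b(ssl_)?(?:port=|endpoint=\S*?:)(\d+)(s)?(?=\s|$)")


def parse_frontend_port(config: str) -> Tuple[int, bool]:
    """Return (port, ssl) for the first port/endpoint option found."""
    match = _PORT_RE.search(config)
    if match is None:
        # Include the config line in the message to ease debugging.
        raise LookupError(
            "Failed to determine RGW port from '{}'".format(config))
    ssl = match.group(1) is not None or match.group(3) is not None
    return int(match.group(2)), ssl
```

`re.search` scans left to right, so when several options are present the first one wins, matching the behaviour described in the last bullet.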
Sebastian Wagner [Wed, 12 Feb 2020 15:21:05 +0000 (16:21 +0100)]
mgr/orchestrator: Use CLICommand, except its global variable
`CLICommand.COMMANDS` is a global variable that prevents
anyone from importing other modules, as the `COMMANDS` are then
merged together. Let's use a meta class instead of a global variable.
Signed-off-by: Sebastian Wagner <sebastian.wagner@suse.com>
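The idea can be sketched as follows. This is a simplified illustration under assumed names (`CommandRegistryMeta`, `CLICommandSketch`, and the two subclasses are hypothetical), not the code from this commit: the metaclass gives every class its own `COMMANDS` dict, so commands registered by one module never leak into another.

```python
class CommandRegistryMeta(type):
    """Give every class created with this metaclass its own COMMANDS dict."""

    def __init__(cls, name, bases, namespace):
        super().__init__(name, bases, namespace)
        cls.COMMANDS = {}  # per-class registry instead of one shared global


class CLICommandSketch(metaclass=CommandRegistryMeta):
    """Decorator that registers a handler under a command prefix."""

    def __init__(self, prefix):
        self.prefix = prefix

    def __call__(self, func):
        # Register in the registry of the concrete decorator class.
        type(self).COMMANDS[self.prefix] = func
        return func


class OrchestratorCLICommand(CLICommandSketch):
    pass


class OtherModuleCLICommand(CLICommandSketch):
    pass


@OrchestratorCLICommand("orch status")
def orch_status():
    return "ok"


@OtherModuleCLICommand("other status")
def other_status():
    return "fine"
```

The metaclass `__init__` runs once per class definition, including subclasses, which is what keeps the registries separate.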
Yingxin Cheng [Mon, 10 Feb 2020 09:00:31 +0000 (17:00 +0800)]
crimson/net: remove duplicated error codes and conditions
The duplicated error codes and conditions were originally introduced to
match connection errors with both system category (thrown by seastar)
and generic category (thrown by standard library). Since error_code
with system category can be matched by error_condition with generic
category (see std::errc and
system_error_category::default_error_condition(int)), our duplicated
counterparts are not needed actually.
Matthew Oliver [Fri, 10 Jan 2020 03:17:11 +0000 (03:17 +0000)]
rgw: make radosgw-admin user create and modify distinct
Currently if you run 'radosgw-admin user create ..' when the user
already exists and you happen to specify, at least, '--uid' and
'--display-name' that match the existing user, radosgw-admin will
actually go modify the existing user.
This behaviour is a little confusing, hence the bug this patch is
fixing. This patch instead simplifies the tool to make
'create' create and 'modify' modify.
Meaning when you go 'create' a user that already exists, you'll get an
error, as expected. If you want to modify a user, you actually have to
use 'modify'.
For example, now:
$ radosgw-admin user create --uid="test-user" --display-name="test user"
could not create user: unable to create user, user: test-user exists
Signed-off-by: Matthew Oliver <moliver@suse.com>
Fixes: https://tracker.ceph.com/issues/38619
Kefu Chai [Sun, 16 Feb 2020 11:05:09 +0000 (19:05 +0800)]
crimson/admin: no need to check for '\n'
as we don't need to mimic the behavior of the classic OSD; all we need to do is
fulfill the needs of the ceph cli. see `admin_socket()` in
`src/pybind/ceph_daemon.py`, which sends a `\0` to indicate the end of a
command.
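A minimal sketch of NUL-terminated command framing, written in Python for illustration (the crimson server itself is C++, and the helper `read_command` is hypothetical). The reader accumulates bytes until it sees `\0`, so embedded newlines need no special handling.

```python
import socket


def read_command(conn: socket.socket, bufsize: int = 4096) -> bytes:
    """Read from conn until a NUL byte and return the bytes before it."""
    buf = bytearray()
    while True:
        chunk = conn.recv(bufsize)
        if not chunk:
            raise ConnectionError(
                "peer closed before sending the terminating NUL")
        end = chunk.find(b"\0")
        if end >= 0:
            # Anything after the NUL is ignored in this sketch.
            buf += chunk[:end]
            return bytes(buf)
        buf += chunk
```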
Kefu Chai [Sun, 16 Feb 2020 10:03:24 +0000 (18:03 +0800)]
crimson/asok: disconnect client when shutdown
track the established connection as well. please note that the current asok
implementation only allows a single connection at a time, even
though a unix domain socket allows multiple concurrent clients, so there
is no need to track multiple clients at this moment.
Kefu Chai [Sun, 16 Feb 2020 08:40:04 +0000 (16:40 +0800)]
crimson/asok: do not assume the order of param eval
* do not assume the order of parameter evaluation, before this change,
we have `do_with(cn.input(), cn.output(), std::move(cn) ...)`, see
https://en.cppreference.com/w/cpp/language/eval_order,
> side effects of the initialization of every parameter are
> indeterminately sequenced with respect to value computations and side
> effects of any other parameter.
we cannot move `cn` out and then call its member functions. so
introduce a struct for capturing its input and output.
* move `do_until_gate()` into `start()`, no need to check if
gate is stopped in `safe_action`, as `seastar::do_until()` will do
this for us.
Kefu Chai [Sun, 16 Feb 2020 02:03:36 +0000 (10:03 +0800)]
crimson: refactor asok command
* do not define another iterator type, use `map::const_iterator`
directly
* do not register hooks/commands with server block, register them
one by one, much simpler this way.
* encapsulate the hook metadata in `AdminSocketHook`, so each
`AdminSocketHook` instance is self-contained in the sense that
we don't need to use an extra type for keeping track of them.
Sage Weil [Sat, 15 Feb 2020 17:40:08 +0000 (11:40 -0600)]
cephadm: separate out required files in config-json
- Put files in a subsection of the config-json.
- Also, consolidate the sanity checks into one place (command_deploy)
instead of duplicating them in create_daemon_dirs.
Paul Cuzner [Wed, 29 Jan 2020 03:10:37 +0000 (16:10 +1300)]
cephadm: add alertmanager deployment feature
Deploy now accepts a daemon_type of alertmanager. Since alertmanager
is a cluster-aware service, the monitoring metadata has been updated to
allow a daemon to use multiple ports. In addition, when config_json is
received, any 'key' prefixed by '_' is skipped when creating files in the
daemon's etc directory. Keys that use the '_' prefix hold config data that
can be used elsewhere. In the case of the alertmanager, a _peers parameter
is required, which is used to add --cluster.peer=<ip>:<port> to the
container command to form the alertmanager cluster.
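The handling of '_'-prefixed keys can be sketched as below. The function names are hypothetical, and the shape of `_peers` (a list of "ip:port" strings) is an assumption; the commit message does not spell it out.

```python
from typing import Dict, List


def files_to_write(config_json: Dict[str, object]) -> Dict[str, object]:
    """Keep only entries that become files; skip '_'-prefixed keys."""
    return {k: v for k, v in config_json.items() if not k.startswith("_")}


def cluster_peer_args(config_json: Dict[str, object]) -> List[str]:
    """Build --cluster.peer arguments from the assumed '_peers' list."""
    peers = config_json.get("_peers", [])
    return ["--cluster.peer={}".format(peer) for peer in peers]
```

For example, a config_json of `{"alertmanager.yml": "...", "_peers": ["10.0.0.1:9094"]}` would yield one file to write and one `--cluster.peer=10.0.0.1:9094` argument.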