git.apps.os.sepia.ceph.com Git - ceph.git/log
John Spray [Tue, 17 Oct 2017 22:16:22 +0000 (18:16 -0400)]
mgr: drop GIL around set_uri, set_health_checks
These didn't need to keep the GIL to go and do their
pure C++ parts, and by keeping it they could deadlock
while trying to take ActiveMgrModules::lock.
Signed-off-by: John Spray <john.spray@redhat.com>
(cherry picked from commit
27ee148e040ebaf512f8e11f814b3a7c8cf21f8b )
John Spray [Tue, 17 Oct 2017 22:14:43 +0000 (18:14 -0400)]
mgr: fix ~MonCommandCompletion
This was doing a Py_DECREF outside of the Gil.
Fixes: http://tracker.ceph.com/issues/21593
Signed-off-by: John Spray <john.spray@redhat.com>
(cherry picked from commit
58dfa97ba88882fb3540d15e31bcac48a1aef5ef )
John Spray [Mon, 16 Oct 2017 14:51:34 +0000 (10:51 -0400)]
mgr: update for SafeThreadState
A bunch of the previous commits were done
before this class existed, so updating in
one go instead of trying to edit history
in fine detail.
Signed-off-by: John Spray <john.spray@redhat.com>
(cherry picked from commit
29193a47e6cf8297d9b1ceecc7695f2c85434999 )
John Spray [Fri, 13 Oct 2017 15:31:22 +0000 (11:31 -0400)]
mgr: refactor PyOSDMap etc implementation
Implement real python classes from the C side,
rather than exposing only module methods.
Signed-off-by: John Spray <john.spray@redhat.com>
(cherry picked from commit
7e61f79f5d56b568103a067d9a1eb87af997ad61 )
Sage Weil [Tue, 26 Sep 2017 22:35:29 +0000 (18:35 -0400)]
mgr/PyOSDMap: add CRUSH get_item_weight
Signed-off-by: Sage Weil <sage@redhat.com>
(cherry picked from commit
eacc9021459b31e42232bb958536d594d03b07b3 )
John Spray [Mon, 16 Oct 2017 10:33:48 +0000 (06:33 -0400)]
mgr: fix py_module_registry shutdown
Was calling way too early, which did a
Py_Finalize before the modules had been
joined.
Signed-off-by: John Spray <john.spray@redhat.com>
(cherry picked from commit
0d5b1d70e616d7d1c2d6360375770f5c4754649d )
John Spray [Thu, 12 Oct 2017 17:14:02 +0000 (13:14 -0400)]
mgr: fix thread naming
Was passing a reference to a local stringstream into
Thread::create, not realising that it was taking a char*
reference instead of a copy. Result was garbage (or usually,
all threads having the name of the last one created)
Signed-off-by: John Spray <john.spray@redhat.com>
(cherry picked from commit
bb4e71ed2ebdee1ac5e4b3eee390060e19fea0d8 )
John Spray [Fri, 6 Oct 2017 15:02:44 +0000 (11:02 -0400)]
mgr: cut down duplication between active+standby
...by using PyModuleRunner class from ActivePyModule too.
Signed-off-by: John Spray <john.spray@redhat.com>
(cherry picked from commit
df8797320bed7ad9f121477e35d7e3862efd89bd )
John Spray [Wed, 4 Oct 2017 17:13:25 +0000 (13:13 -0400)]
mgr: fix os._exit overrides
These would throw an exception when passed
a status code.
Signed-off-by: John Spray <john.spray@redhat.com>
(cherry picked from commit
e2442c1e20bf4ff12d58af500b34a18cc60d2de1 )
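The failure mode described in this commit can be sketched in a few lines of Python. This is only an illustration, not the code from the commit: the function names are hypothetical, and the sketch calls the replacements directly instead of patching `os._exit`.

```python
def broken_exit_override():
    """Hypothetical override that takes no arguments.

    Callers of os._exit always pass a status, so this raises TypeError.
    """
    return None


def fixed_exit_override(status=0):
    """Hypothetical override that accepts the status code callers pass."""
    return status


def call_like_os_exit(override, status):
    """Invoke an override the way a module would invoke os._exit(status)."""
    try:
        return ("ok", override(status))
    except TypeError as exc:
        return ("error", type(exc).__name__)
```

Calling `call_like_os_exit(broken_exit_override, 1)` reports the `TypeError`, while the fixed variant simply receives the status.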
John Spray [Thu, 24 Aug 2017 18:07:37 +0000 (14:07 -0400)]
mon/MgrMonitor: reset services map on drop_active
Signed-off-by: John Spray <john.spray@redhat.com>
(cherry picked from commit
7b629ae46599d79ca1929cfc6637b367c6bb9029 )
John Spray [Tue, 22 Aug 2017 18:47:10 +0000 (14:47 -0400)]
mgr/dashboard: implement standby mode
Signed-off-by: John Spray <john.spray@redhat.com>
(cherry picked from commit
4f7007d1b0226af3f0cc33627ebf5051975657ac )
John Spray [Tue, 22 Aug 2017 15:41:26 +0000 (11:41 -0400)]
pybind/mgr: add MgrStandbyModule
Signed-off-by: John Spray <john.spray@redhat.com>
(cherry picked from commit
3048e85cd712b7da77cf6ac55dd6a689d00e47e5 )
John Spray [Tue, 22 Aug 2017 18:42:11 +0000 (14:42 -0400)]
mgr: standby modules come up and run now
...they still don't have access to any config though.
Signed-off-by: John Spray <john.spray@redhat.com>
(cherry picked from commit
c1471c7501948004096581ee415ab4a1fa2d9379 )
John Spray [Wed, 16 Aug 2017 14:23:59 +0000 (10:23 -0400)]
mgr: enable running modules in standby mode
Modules can implement a second, separate class
that has access to very little state about the
system and can't implement commands.
They have just enough information to redirect
or forward incoming requests/traffic to the
active instance of the module on the active mgr.
This enables module authors to create modules
that end users can access via any (running) mgr node
at any time, rather than having to first work out
which mgr node is active.
Signed-off-by: John Spray <john.spray@redhat.com>
(cherry picked from commit
25566d1edca638bd15b3ba3326ee7e4d3e573cbb )
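A rough sketch of what a standby instance might do with an incoming HTTP request, assuming it knows only the URI advertised by the active instance. The function name, the status codes chosen, and the example URI are assumptions made for illustration, not the MgrStandbyModule API.

```python
def standby_response(active_uri, request_path):
    """Return (status, headers, body) for a request that reaches a standby mgr."""
    if not active_uri:
        # No active instance is known yet, so there is nowhere to redirect to.
        return 503, {}, "No active ceph-mgr instance is currently known\n"
    # Join the advertised base URI and the requested path with a single slash.
    location = active_uri.rstrip("/") + "/" + request_path.lstrip("/")
    return 303, {"Location": location}, ""
```

For example, `standby_response("http://mgr-a:7000/", "/health")` yields a redirect to `http://mgr-a:7000/health`, so a user can point a browser at any running mgr node.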
John Spray [Tue, 15 Aug 2017 10:53:18 +0000 (06:53 -0400)]
mgr: clean up python source file naming
Signed-off-by: John Spray <john.spray@redhat.com>
(cherry picked from commit
70d45a6b93c92baf8d6a3b15765110a5384c5e60 )
John Spray [Mon, 14 Aug 2017 10:31:18 +0000 (06:31 -0400)]
mgr: refactor python module management
Separate out the *loading* of modules from
the *running* of modules.
This is a precursor to enabling modules to run
in standby mode.
Signed-off-by: John Spray <john.spray@redhat.com>
(cherry picked from commit
9718896c8b844db2f3c07df1d344636da4605e61 )
John Spray [Thu, 27 Jul 2017 17:49:27 +0000 (13:49 -0400)]
pybind/mgr: use set_uri hook from dashboard+restful modules
No more guessing the URL!
Signed-off-by: John Spray <john.spray@redhat.com>
(cherry picked from commit
089e105dd7ec762572ac06794caa7f5543075001 )
John Spray [Thu, 27 Jul 2017 15:50:23 +0000 (11:50 -0400)]
mgr: enable python modules to advertise their service URI
Fixes: http://tracker.ceph.com/issues/17460
Signed-off-by: John Spray <john.spray@redhat.com>
(cherry picked from commit
a0183a63fa791954d14c57632e184858cefe893d )
John Spray [Thu, 27 Jul 2017 15:49:45 +0000 (11:49 -0400)]
mon/MgrMonitor: store services in map and expose with command
Signed-off-by: John Spray <john.spray@redhat.com>
(cherry picked from commit
c3c3e4e90ba6b09e29879b500f211d607ebabb53 )
John Spray [Thu, 27 Jul 2017 15:46:40 +0000 (11:46 -0400)]
messages: `services` in MMgrBeacon
Signed-off-by: John Spray <john.spray@redhat.com>
(cherry picked from commit
236841b3b62af92ce0c4852045327fcfbc5c1651 )
John Spray [Thu, 27 Jul 2017 15:45:53 +0000 (11:45 -0400)]
mon/MgrMap: store list of services
Signed-off-by: John Spray <john.spray@redhat.com>
(cherry picked from commit
3f703bd91f07b2fe43a16df0083d7b7c23803fd5 )
John Spray [Thu, 27 Jul 2017 10:31:01 +0000 (06:31 -0400)]
mgr: carry PyModules ref in MonCommandCompletion
Signed-off-by: John Spray <john.spray@redhat.com>
(cherry picked from commit
e938bf9b9d27e192765c805e5f532c9dd4808b21 )
John Spray [Wed, 26 Jul 2017 16:31:13 +0000 (12:31 -0400)]
pybind: update MgrModule for ceph_state->ceph_module
& tidy up the places where ceph_state was getting
used outside of MgrModule.
Signed-off-by: John Spray <john.spray@redhat.com>
(cherry picked from commit
62cb512e4740f1f78f516b4f2179c1123fae1b36 )
John Spray [Wed, 26 Jul 2017 11:44:00 +0000 (07:44 -0400)]
mgr: refactor python interface
Expose a python class instead of a module,
so that we have a place to carry our reference
to our MgrPyModule* and to PyModules*, rather than
passing a handle for the former and using
a global pointer for the latter.
Signed-off-by: John Spray <john.spray@redhat.com>
(cherry picked from commit
563878ba217491dd0a6fbd588cd56d09e3456c14 )
John Spray [Thu, 3 Aug 2017 10:22:35 +0000 (06:22 -0400)]
mgr/dashboard: remove blue highlight on scrubbing pg states
This was kind of unnecessary, highlighting a completely normal
and healthy situation in a different colour. The blue was
also really hard to read against a grey background.
Signed-off-by: John Spray <john.spray@redhat.com>
(cherry picked from commit
99fa1fdf4e1be57792f50907147781d12009b32b )
John Spray [Thu, 27 Jul 2017 15:42:16 +0000 (11:42 -0400)]
mgr/dashboard: clean up fs standby list when empty
Signed-off-by: John Spray <john.spray@redhat.com>
(cherry picked from commit
5e64787c0ae0ac2a365c89bf89dfea425adc17d4 )
John Spray [Wed, 30 Aug 2017 12:56:39 +0000 (13:56 +0100)]
mgr: remove old-style config opt usage
Signed-off-by: John Spray <john.spray@redhat.com>
(cherry picked from commit
ec09a7abc515f802451bf7ef3d22ce8ee6c6c7b3 )
John Spray [Wed, 30 Aug 2017 11:12:40 +0000 (12:12 +0100)]
mon: remove old-style mgr config opt usage
Signed-off-by: John Spray <john.spray@redhat.com>
(cherry picked from commit
6af4120d63324150ba19022c41fe4fa8a38cacbb )
John Spray [Wed, 30 Aug 2017 10:48:25 +0000 (11:48 +0100)]
common: populate manager config option metadata
Signed-off-by: John Spray <john.spray@redhat.com>
(cherry picked from commit
eba4c3f2762ae40ba746091e32364c2d68e780d9 )
Kefu Chai [Thu, 13 Jul 2017 06:49:48 +0000 (14:49 +0800)]
common,mds,mgr,mon,osd: store event only if it's added
otherwise
* we will try to cancel it even if it's never been added
* we will keep a dangling pointer around, which is, well,
scary.
* static analyzer will yell at us:
Memory - illegal accesses (USE_AFTER_FREE)
Signed-off-by: Kefu Chai <kchai@redhat.com>
(cherry picked from commit
2449b3a5c365987746ada095fde30e3dc63ee0c7 )
John Spray [Tue, 3 Oct 2017 12:16:10 +0000 (08:16 -0400)]
mgr: safety checks on pyThreadState usage
Previously relied on the caller of Gil() to
pass new_thread=true if they would be
calling from a different thread.
Enforce this with an assertion, by wrapping
PyThreadState in a SafeThreadState class
that remembers which POSIX thread
it's meant to be used in.
Signed-off-by: John Spray <john.spray@redhat.com>
(cherry picked from commit
625e1b5cfb9b8a5843dfe75e97826f70a57d6ebe )
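The idea of a wrapper that remembers its owning thread can be sketched in Python. The real class is C++ and wraps a PyThreadState; the version below is an analogy under that assumption, and its method names are hypothetical.

```python
import threading


class SafeThreadState:
    """Wrap a value that must only be used from the thread that created it."""

    def __init__(self, state):
        self._state = state
        # Remember which thread this state belongs to.
        self._owner = threading.get_ident()

    def get(self):
        # Fail loudly instead of relying on the caller to pass a flag.
        if threading.get_ident() != self._owner:
            raise AssertionError("thread state used from the wrong thread")
        return self._state
```

Using the wrapper from its creating thread returns the state; calling `get()` from any other thread raises, which turns a silent misuse into an immediate failure.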
John Spray [Tue, 22 Aug 2017 15:38:25 +0000 (11:38 -0400)]
mgr: move Gil implementation into .cc
The inclusion of Python.h in the .h was awkward
for other files including Gil.h.
Signed-off-by: John Spray <john.spray@redhat.com>
(cherry picked from commit
23c3a075ee1a27e1b57fcb452a4d6ce53080264e )
John Spray [Wed, 26 Jul 2017 11:21:40 +0000 (07:21 -0400)]
mgr: reduce Gil verbosity at level 20
Even at 20, it's pretty heavy to be logging
every lock acquire/release.
Signed-off-by: John Spray <john.spray@redhat.com>
(cherry picked from commit
987612a97529be7e67b89977c4a0cf47906a5ecb )
Jan Fajerski [Wed, 11 Oct 2017 10:28:19 +0000 (12:28 +0200)]
pybind/mgr/prometheus: no ports in osd_metadata
Ports might change on an OSD restart and this would create a new metadata
metric for this osd.
Signed-off-by: Jan Fajerski <jfajerski@suse.com>
(cherry picked from commit
48fec7db4b214fe8ef6a04f8cb53fb8a2fb9c2ca )
Jan Fajerski [Wed, 11 Oct 2017 08:59:33 +0000 (10:59 +0200)]
pybind/mgr/prometheus: add osd_in/out metric; make osd_weight a metric
Signed-off-by: Jan Fajerski <jfajerski@suse.com>
(cherry picked from commit
e4c44c1d702ce242f2cb9a58ca7ce1c31fe0a498 )
Jan Fajerski [Wed, 11 Oct 2017 18:07:19 +0000 (20:07 +0200)]
pybind/mgr_module: move PRIO_* and PERFCOUNTER_* to MgrModule class
Signed-off-by: Jan Fajerski <jfajerski@suse.com>
(cherry picked from commit
f69484debade5f4fa2bd3a0d1badc9291cc9d7b7 )
John Spray [Mon, 9 Oct 2017 11:10:22 +0000 (12:10 +0100)]
qa/mgr: fix influx/prometheus test names
This was a typo: they were swapped around.
Signed-off-by: John Spray <john.spray@redhat.com>
(cherry picked from commit
d96a59e74b6984b77c9f3b15f702e3bf45053590 )
John Spray [Thu, 28 Sep 2017 14:50:53 +0000 (10:50 -0400)]
doc: flesh out prometheus docs
Explain ceph_disk_occupation, importance
of instance labels and honor_labels, provide
example prometheus configuration yaml.
Signed-off-by: John Spray <john.spray@redhat.com>
(cherry picked from commit
5227afed5f33fa9487e1bfa3fd8ce0d82eb4a20f )
John Spray [Thu, 28 Sep 2017 14:10:14 +0000 (10:10 -0400)]
mgr/prometheus: add ceph_disk_occupation series
This is the magic series that enables consumers to
easily get the drive stats that go with their
OSD stats.
Fixes: http://tracker.ceph.com/issues/21594
Signed-off-by: John Spray <john.spray@redhat.com>
(cherry picked from commit
284be75524f7125dc1409b9c05fe47b37484964e )
Benjeman Meekhof [Wed, 4 Oct 2017 14:05:17 +0000 (10:05 -0400)]
mgr/influx: Correct name of daemon stat measurement to 'ceph_daemon_stats'
Signed-off-by: Benjeman Meekhof <bmeekhof@umich.edu>
(cherry picked from commit
f9014a1c75c6a3adf414b48a707fd444e65b3024 )
Benjeman Meekhof [Tue, 3 Oct 2017 20:30:43 +0000 (16:30 -0400)]
mgr/influx: modify module database check to not require admin privileges
- existing check tried to list all databases and failed, even when the database existed, if the user was not admin level
- still tries to create database if not found and user has privs
Signed-off-by: Benjeman Meekhof <bmeekhof@umich.edu>
(cherry picked from commit
06d7d37c7b9a8c3f4435eff04b6f4934be5e676f )
Jan Fajerski [Tue, 10 Oct 2017 06:40:31 +0000 (08:40 +0200)]
pybind/mgr/prometheus: fix metric type undef -> untyped
Signed-off-by: Jan Fajerski <jfajerski@suse.com>
(cherry picked from commit
6306392492d103200b21ea91bce10a315d7c4e16 )
John Spray [Mon, 25 Sep 2017 15:14:57 +0000 (11:14 -0400)]
mgr: respect perf counter prio_adjust in MgrClient
This awkwardly involves re-ordering some definitions
in perf_counters.h in order to refer to the prio
names defined in PerfCountersBuilder.
Signed-off-by: John Spray <john.spray@redhat.com>
(cherry picked from commit
88163749b572ffd2bfe0850136fad5dbed2a9180 )
John Spray [Mon, 18 Sep 2017 13:06:13 +0000 (09:06 -0400)]
test: update perfcounters test for priority in output
Signed-off-by: John Spray <john.spray@redhat.com>
(cherry picked from commit
0f531f7871a68db96b2fb66ffdf6fae6935e6107 )
John Spray [Wed, 13 Sep 2017 21:16:54 +0000 (17:16 -0400)]
qa: add mgr module selftest task
The module self test commands give us a chance to
catch any other ceph changes that change something
that a module was relying on reading.
Signed-off-by: John Spray <john.spray@redhat.com>
(cherry picked from commit
99352ceced9d0fe92ddad6b97b1393b41de75d50 )
John Spray [Wed, 13 Sep 2017 14:46:56 +0000 (10:46 -0400)]
mgr/prometheus: remove explicit counter list
These have had their priorities bumped up to
USEFUL, so they'll appear in the default
get_all_counters output.
Signed-off-by: John Spray <john.spray@redhat.com>
(cherry picked from commit
ad5a31efbea8081f03dd73669e891d03857ef9cc )
John Spray [Wed, 13 Sep 2017 14:45:21 +0000 (10:45 -0400)]
mon: elevate priority of many perf counters
We can be quite liberal here, because mons are
small in number. However, we don't want to expose
KV database counters at this priority from OSDs, so
use the prio_adjust mechanism for that.
Signed-off-by: John Spray <john.spray@redhat.com>
(cherry picked from commit
ac8320f23dd4c00eb80da0d9837c29744e38bd57 )
John Spray [Wed, 13 Sep 2017 11:07:50 +0000 (07:07 -0400)]
osd: upgrade a bunch of perf counters to PRIO_USEFUL
These are broadly the OSD-wide IO stats, which happen
to also be the ones that were named in the
prometheus plugin until I changed it to be
priority-based.
Signed-off-by: John Spray <john.spray@redhat.com>
(cherry picked from commit
a1cc4ba2993de62b60fd1e58a9704877a6da5fe4 )
John Spray [Wed, 13 Sep 2017 11:06:24 +0000 (07:06 -0400)]
common: PerfCountersBuilder helper for priorities
Let the caller set a priority as the default, to enable them
to create a bunch at a given priority. This is just a
convenience.
Signed-off-by: John Spray <john.spray@redhat.com>
(cherry picked from commit
66f61eeda6a2465b5fc0e40a4f1300913db065dc )
John Spray [Tue, 12 Sep 2017 14:27:12 +0000 (10:27 -0400)]
mgr/prometheus: add a self-test command
Signed-off-by: John Spray <john.spray@redhat.com>
(cherry picked from commit
76e1ba52b1b95d417cdd04b8fe985acee648f0e9 )
John Spray [Tue, 12 Sep 2017 12:05:28 +0000 (08:05 -0400)]
mgr/influx: remove file-based config
...and also trim down the configuration to what's really
needed. In general users don't need to pick and choose
metrics. We could add it back if there was a strong
motivation.
Signed-off-by: John Spray <john.spray@redhat.com>
(cherry picked from commit
6776d4645afc49a4bfb4b62673c91384239037f4 )
John Spray [Tue, 12 Sep 2017 10:51:21 +0000 (06:51 -0400)]
mgr/influx: enable self-test without dependencies
The idea of self-test commands is that they're self
contained and just exercise the module's calls
to the Ceph-side.
Signed-off-by: John Spray <john.spray@redhat.com>
(cherry picked from commit
125294ab9d6e99aa4c960fea147a4e86624b869e )
John Spray [Tue, 12 Sep 2017 10:18:15 +0000 (06:18 -0400)]
mgr/influx: revise perf counter handling
- Use new get_all_perf_counters path
- Consequently get counters for all daemons, not just OSD
- Tag stats with ceph_daemon rather than osd_id, as some
stats appear from more than one daemon type
- Remove summing of perf counters, external TSDB and/or queries
can do this.
- Remove mgr_id tag: this would change depending on which
mgr was active, which is certainly not desirable.
Signed-off-by: John Spray <john.spray@redhat.com>
(cherry picked from commit
59b48e7660f4b757804974835027cd08a59843c2 )
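The tagging scheme in the list above can be sketched as follows. The input shape (`{daemon: {counter_path: {"value": ...}}}`) and the `counter` tag name are assumptions made for illustration; the measurement name follows the 'ceph_daemon_stats' naming mentioned elsewhere in this log.

```python
def counters_to_points(all_counters, timestamp):
    """Turn per-daemon counters into a list of InfluxDB-style point dicts."""
    points = []
    for daemon, counters in sorted(all_counters.items()):
        for path, info in sorted(counters.items()):
            points.append({
                "measurement": "ceph_daemon_stats",
                # Tag by daemon name (e.g. "osd.0", "mds.a"), never by mgr id,
                # so the series do not change when the active mgr changes.
                "tags": {"ceph_daemon": daemon, "counter": path},
                "time": timestamp,
                # No summing here: each raw value is sent as-is.
                "fields": {"value": info["value"]},
            })
    return points
```

Because every point carries the full daemon name, same-named counters from different daemon types stay distinct, and any aggregation is left to the TSDB queries.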
John Spray [Thu, 3 Aug 2017 17:00:56 +0000 (13:00 -0400)]
mgr: omit module list in beacon logging
This is useful in itself, but awkward when dealing
with logs generally, because it means that when you
grep on the name of a module, you get mostly beacon
messages rather than the log messages from the
module.
Signed-off-by: John Spray <john.spray@redhat.com>
(cherry picked from commit
8d1277fa5c578ce0ea23a70cc58c6cf99921ee25 )
John Spray [Tue, 12 Sep 2017 09:42:23 +0000 (05:42 -0400)]
mgr: define perf counter constants in mgr_module
So that modules can consume perf counter data
intelligently without having to hunt around
in C land for these constants and redefine them.
Signed-off-by: John Spray <john.spray@redhat.com>
(cherry picked from commit
39ab28ed47e869e1466cb3a316a2cb11bdedd23a )
John Spray [Mon, 11 Sep 2017 13:12:25 +0000 (09:12 -0400)]
ceph.in: use PRIO_INTERESTING as daemonperf threshold
Using PRIO_USEFUL as the threshold for what goes into
time series databases. I'm claiming that we have
more "useful" counters than fit on the screen,
so daemonperf's "a screen's worth" threshold
should be at the "interesting" level.
Signed-off-by: John Spray <john.spray@redhat.com>
(cherry picked from commit
30a74ce343caec2a433cb532ba697fe7013ed05c )
John Spray [Mon, 11 Sep 2017 13:12:01 +0000 (09:12 -0400)]
mon: set some priorities on perf counters
Signed-off-by: John Spray <john.spray@redhat.com>
(cherry picked from commit
29a71c35c39fbe1d4887e3f5ebb93232daab3487 )
John Spray [Mon, 4 Sep 2017 09:39:11 +0000 (05:39 -0400)]
mgr/prometheus: tag stats by daemon name
Using osd=0 or similar tags was problematic because
daemons of different types have some same-named
counters (e.g. MDS and OSD both have objecter
perf counters).
Signed-off-by: John Spray <john.spray@redhat.com>
(cherry picked from commit
eb524c272c89f8f99f22969b78caa016db7c671e )
John Spray [Fri, 1 Sep 2017 16:02:37 +0000 (12:02 -0400)]
mgr/prometheus: use new get_all_perf_counters interface
Signed-off-by: John Spray <john.spray@redhat.com>
(cherry picked from commit
11137aa269271ad15dcf19a8d51ce6f4acb7a98e )
John Spray [Fri, 1 Sep 2017 16:01:35 +0000 (12:01 -0400)]
common: use fixed size int for perf counter prio
...to avoid any ambiguity in allowed range and
make clear how to encode it down the wire.
Signed-off-by: John Spray <john.spray@redhat.com>
(cherry picked from commit
ba08fc1008d17aa7a5f285ea2705705ce1a0bda0 )
John Spray [Fri, 1 Sep 2017 16:00:59 +0000 (12:00 -0400)]
mgr: transmit perf counter prio to the mgr
Signed-off-by: John Spray <john.spray@redhat.com>
(cherry picked from commit
f304f84cfbc22c1a54d152cc38227077bc564a7e )
John Spray [Fri, 1 Sep 2017 14:46:56 +0000 (10:46 -0400)]
common: always include priority in perf counter dump
JSON output with inconsistent sets of members is
annoying to use on the receiving side.
Signed-off-by: John Spray <john.spray@redhat.com>
(cherry picked from commit
e631f1a72735ec618e2f3012ad7b9c5830d6c0eb )
John Spray [Tue, 29 Aug 2017 15:55:28 +0000 (11:55 -0400)]
mgr: add get_all_perf_counters to MgrModule interface
This is for use by modules that dump counters
in bulk, e.g. to a TSDB.
Signed-off-by: John Spray <john.spray@redhat.com>
(cherry picked from commit
9a42d4255d9d968d6162b53b71db292d9d3de2e4 )
Jan Fajerski [Fri, 11 Aug 2017 11:09:24 +0000 (13:09 +0200)]
pybind/mgr/prometheus: export cluster-wide pg stats, not per osd
Signed-off-by: Jan Fajerski <jfajerski@suse.com>
(cherry picked from commit
13b1236b96d4563e0985cad40d3009b60cc475e7 )
Jan Fajerski [Fri, 11 Aug 2017 10:51:47 +0000 (12:51 +0200)]
pybind/mgr/prometheus: add more osd metadata
Signed-off-by: Jan Fajerski <jfajerski@suse.com>
(cherry picked from commit
e7704fa9cc35549dba526212c2830df589670416 )
Jan Fajerski [Fri, 11 Aug 2017 10:05:09 +0000 (12:05 +0200)]
pybind/mgr/prometheus: don't get perf counters that are not in schema
Signed-off-by: Jan Fajerski <jfajerski@suse.com>
(cherry picked from commit
d4ba07d04477ccae3a89dcdcafbb7e76149dfd1c )
Jan Fajerski [Fri, 11 Aug 2017 10:04:28 +0000 (12:04 +0200)]
pybind/mgr/prometheus: add mon and osd perf counters to export
Signed-off-by: Jan Fajerski <jfajerski@suse.com>
(cherry picked from commit
fa25d31263a26074225e2a00cb82448066b54069 )
Jan Fajerski [Thu, 10 Aug 2017 17:46:07 +0000 (19:46 +0200)]
pybind/mgr/prometheus: add index page, export metrics under metrics/
Signed-off-by: Jan Fajerski <jfajerski@suse.com>
(cherry picked from commit
d99a506ed37c2d0991d68ecd34ac5fb213a3eea4 )
Jan Fajerski [Thu, 10 Aug 2017 16:19:42 +0000 (18:19 +0200)]
pybind/mgr/prometheus: export selected perf_counters
Signed-off-by: Jan Fajerski <jfajerski@suse.com>
(cherry picked from commit
f6e2e36ba72caf6347f3bb6a985925d0e35077a2 )
Jan Fajerski [Thu, 10 Aug 2017 16:18:36 +0000 (18:18 +0200)]
pybind/mgr/prometheus: export osd and pool metadata
Signed-off-by: Jan Fajerski <jfajerski@suse.com>
(cherry picked from commit
2bea3814699c27baa8f633b56a8800d697685898 )
Jan Fajerski [Thu, 10 Aug 2017 16:15:56 +0000 (18:15 +0200)]
pybind/mgr/prometheus: actually emit reported pg counts
Signed-off-by: Jan Fajerski <jfajerski@suse.com>
(cherry picked from commit
c288624eed862559b2c86c5dfc85c837716739ab )
Jan Fajerski [Thu, 10 Aug 2017 16:09:17 +0000 (18:09 +0200)]
pybind/mgr/prometheus: no need to wait for notify event
If stats or perf counters are not available they won't be emitted.
Signed-off-by: Jan Fajerski <jfajerski@suse.com>
(cherry picked from commit
ead0973d7dd12fe985390891c80f1bc15f7b9aec )
Jan Fajerski [Thu, 10 Aug 2017 16:07:14 +0000 (18:07 +0200)]
pybind/mgr/prometheus: no need to convert perf_schema to ordered_dict
Signed-off-by: Jan Fajerski <jfajerski@suse.com>
(cherry picked from commit
5e4b4b5ea2a217731691c1c391c252b08452798a )
Jan Fajerski [Wed, 9 Aug 2017 15:22:49 +0000 (17:22 +0200)]
pybind/mgr/prometheus: add device_class label to osd metrics
Signed-off-by: Jan Fajerski <jfajerski@suse.com>
(cherry picked from commit
76d1918724320b7d6b1120b57b3002bb24099001 )
Jan Fajerski [Wed, 9 Aug 2017 14:19:38 +0000 (16:19 +0200)]
pybind/mgr/prometheus: add cluster wide metrics; no perf counters for now
Signed-off-by: Jan Fajerski <jfajerski@suse.com>
(cherry picked from commit
49b3ff83cd231066d2a8f1809fadbdeb2c0c1f88 )
Jan Fajerski [Fri, 4 Aug 2017 08:23:11 +0000 (10:23 +0200)]
pybind/mgr/prometheus: prefix metrics with 'ceph'; replace :: with _
Both follow prometheus best practices. While : is a legal metric
character, "Exposed metrics should not contain colons, these are for
users to use when aggregating."
Signed-off-by: Jan Fajerski <jfajerski@suse.com>
(cherry picked from commit
177afcc7886aa3898d092ebd1e101697bc6539fd )
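The two renaming steps can be sketched like this. The helper name and the sample counter path are hypothetical, and the use of an underscore after the `ceph` prefix is an assumption.

```python
def prometheus_name(counter_path):
    """Build an exported metric name from an internal counter path."""
    # Colons are reserved for user-defined aggregations, so replace "::".
    name = counter_path.replace("::", "_")
    # Prefix everything so the metrics are recognisably from Ceph.
    return "ceph_" + name
```

For instance, a hypothetical path `cluster::osd_count` would be exported as `ceph_cluster_osd_count`.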
mhdo2 [Mon, 21 Aug 2017 16:13:01 +0000 (12:13 -0400)]
doc/mgr: add influx plugin docs
Signed-off-by: My Do <mhdo@umich.edu>
(cherry picked from commit
e345fe3c5780976a4e33488b3a75cd24bb2c96c5 )
mhdo2 [Tue, 18 Jul 2017 22:33:55 +0000 (18:33 -0400)]
mgr/influx: added influx plugin
Signed-off-by: My Do <mhdo@umich.edu>
(cherry picked from commit
68ae26c014d0471cc3f2f979dc8d822b2e50740f )
John Spray [Sat, 23 Sep 2017 15:55:55 +0000 (11:55 -0400)]
mgr: store declared_types in MgrSession
Because we don't (yet) properly prevent multiple sessions
from daemons reporting the same name (e.g. rgws), storing
it in the DaemonPerfCounters meant that one daemon's report
was referring to another daemon's set of reported types.
This should always have been a property of the session.
The behaviour will still be ugly when multiple daemons
are using the same name (stomping on each other's stats/status)
but it shouldn't crash.
Fixes: http://tracker.ceph.com/issues/21197
Signed-off-by: John Spray <john.spray@redhat.com>
(cherry picked from commit
dc415f1ae09a308bd448614934a4c168eb9cf07b )
John Spray [Mon, 18 Sep 2017 09:12:00 +0000 (10:12 +0100)]
mgr: make pgmap_ready atomic to avoid taking lock
Signed-off-by: John Spray <john.spray@redhat.com>
(cherry picked from commit
d20915741d985e080a723cd6563bc6f4a657276f )
John Spray [Mon, 28 Aug 2017 11:29:36 +0000 (07:29 -0400)]
mgr/DaemonServer: handle MMgrReports in parallel
The DaemonStateIndex locking is sufficient to make all
the report processing safe: holding DaemonServer::lock
through all ms_dispatch was unnecessarily serializing
dispatch.
Signed-off-by: John Spray <john.spray@redhat.com>
(cherry picked from commit
64af9d3da0fceff9ad0ff668f60d272c46912f34 )
John Spray [Thu, 24 Aug 2017 16:53:24 +0000 (12:53 -0400)]
mgr: clean up DaemonStateIndex locking
Various things here were dangerously operating
outside locks.
Additionally switch to a RWLock because this lock
will be relatively read-hot when it's taken every time
a MMgrReport is handled, to look up the DaemonState
for the sender.
Fixes: http://tracker.ceph.com/issues/21158
Signed-off-by: John Spray <john.spray@redhat.com>
(cherry picked from commit
806f10847cefe5c7a78fc319b1b130d372197dd3 )
John Spray [Thu, 31 Aug 2017 16:13:23 +0000 (12:13 -0400)]
mgr: runtime adjustment of perf counter threshold
ceph-mgr has missed out on the `config set` command
that the other daemons got recently: add it here
and hook it all up to the stats period and threshold
settings.
Signed-off-by: John Spray <john.spray@redhat.com>
(cherry picked from commit
057b73d641decb9403aba50caae9d139f3a34dd4 )
John Spray [Mon, 31 Jul 2017 13:24:09 +0000 (09:24 -0400)]
mgr: apply a threshold to perf counter prios
...so that we can control the level of load
we're putting on ceph-mgr with perf counters. Don't collect
anything below PRIO_USEFUL by default.
Signed-off-by: John Spray <john.spray@redhat.com>
(cherry picked from commit
bdc775fdd8acdad5c58ff3065a21396f80ce5db4 )
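A small sketch of threshold-based filtering, assuming each counter in a schema carries a numeric priority. The numeric values given to the PRIO_* names and the schema layout are assumptions for illustration; the authoritative definitions live in the Ceph source.

```python
# Assumed numeric values; higher means more important.
PRIO_CRITICAL = 10
PRIO_INTERESTING = 8
PRIO_USEFUL = 5
PRIO_UNINTERESTING = 2
PRIO_DEBUGONLY = 0


def counters_to_collect(schema, threshold=PRIO_USEFUL):
    """Return the sorted counter paths whose priority meets the threshold."""
    return sorted(
        path for path, info in schema.items()
        if info["priority"] >= threshold
    )
```

Raising the threshold to `PRIO_INTERESTING` shrinks the collected set further, which is the lever for limiting load on ceph-mgr.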
Sage Weil [Tue, 8 Aug 2017 20:36:23 +0000 (16:36 -0400)]
pybind/mgr/balancer: make auto mode work
(with upmap at least)
Signed-off-by: Sage Weil <sage@redhat.com>
(cherry picked from commit
ef1a3be05671ad31907cf8c4beb64a766359bc66 )
Spandan Kumar Sahu [Sun, 6 Aug 2017 22:31:57 +0000 (04:01 +0530)]
src/pybind/mgr/balancer/module.py: improve scoring method
* score lies in [0, 1), 0 being perfect distribution
* use shifted and scaled cdf of normal distribution
to prioritize highly over-weighted device.
* consider only over-weighted devices to calculate score
Signed-off-by: Spandan Kumar Sahu <spandankumarsahu@gmail.com>
(cherry picked from commit
c09308c49ca087fb8c5e7d4261b0234190f863d9 )
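One way to read the scoring properties listed above is sketched below. This is not the code from balancer/module.py: the function names, the use of relative excess as the input, and averaging over over-weighted devices are all assumptions chosen to satisfy the stated properties.

```python
import math


def device_penalty(actual, target):
    """Map one device's relative excess over its target onto the range 0..1."""
    if target <= 0 or actual <= target:
        # Only over-weighted devices contribute to the score.
        return 0.0
    excess = (actual - target) / target
    # The standard normal CDF is 0.5 at zero and tends towards 1. Shifting by
    # 0.5 and scaling by 2 maps zero excess to 0 and large excess towards 1.
    cdf = 0.5 * (1.0 + math.erf(excess / math.sqrt(2.0)))
    return 2.0 * (cdf - 0.5)


def score(actual, target):
    """Average penalty over over-weighted devices; 0.0 is a perfect spread."""
    penalties = [
        device_penalty(actual.get(dev, 0.0), want)
        for dev, want in target.items()
        if actual.get(dev, 0.0) > want
    ]
    if not penalties:
        return 0.0
    return sum(penalties) / len(penalties)
```

With this shape, a heavily over-weighted device pushes the score up much faster than several slightly over-weighted ones, which matches the stated goal of prioritizing the worst offenders.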
Sage Weil [Fri, 4 Aug 2017 21:59:20 +0000 (17:59 -0400)]
pybind/mgr/balancer: make 'crush-compat' sort of work
Signed-off-by: Sage Weil <sage@redhat.com>
(cherry picked from commit
7a00e02acd1b2ff21dac829de30f80fd69eae602 )
Sage Weil [Thu, 3 Aug 2017 20:23:08 +0000 (16:23 -0400)]
pybind/mgr/balancer: rough framework
Signed-off-by: Sage Weil <sage@redhat.com>
(cherry picked from commit
d5e5c68c374e7d5514f89aac2d3df6008d103a76 )
Sage Weil [Fri, 28 Jul 2017 03:33:06 +0000 (23:33 -0400)]
mgr/PyOSDMap: OSDMap.map_pool_pgs_up, CRUSHMap.get_item_name
Signed-off-by: Sage Weil <sage@redhat.com>
(cherry picked from commit
a928bf62316c32f37dd1791192fd9a2ddaef0d33 )
Sage Weil [Sun, 23 Jul 2017 04:10:56 +0000 (00:10 -0400)]
mgr/PyOSDMap: get_crush, find_takes, get_take_weight_osd_map
These let us identify distinct CRUSH hierarchies that rules distribute
data over, and create relative weight maps for the OSDs they map to.
Signed-off-by: Sage Weil <sage@redhat.com>
(cherry picked from commit
3b8a276c437cfd599c55a935d141375afda676ff )
Sage Weil [Thu, 27 Jul 2017 14:07:31 +0000 (10:07 -0400)]
crush/CrushWrapper: rule_has_take
Signed-off-by: Sage Weil <sage@redhat.com>
(cherry picked from commit
ef140de639078b40c05971fb219f7b8c12d83228 )
Sage Weil [Sun, 23 Jul 2017 03:50:27 +0000 (23:50 -0400)]
crush/CrushWrapper: refactor get_rule_weight_osd_map to work with roots too
Allow us to specify a root node in the hierarchy instead of a rule.
This way we can use it in conjunction with find_takes().
Signed-off-by: Sage Weil <sage@redhat.com>
(cherry picked from commit
69454e0570274ff7f252e7f081965dcc9bb04459 )
Sage Weil [Sun, 23 Jul 2017 03:17:18 +0000 (23:17 -0400)]
pybind/mgr/balancer: do upmap by pool, in random order
Signed-off-by: Sage Weil <sage@redhat.com>
(cherry picked from commit
028a66d43244c15a77e71f3d3e4f41773837ab02 )
Sage Weil [Tue, 11 Jul 2017 20:27:08 +0000 (16:27 -0400)]
pybind/mgr/balancer: add balancer module
- wake up every minute
- back off when unknown, inactive, degraded
- throttle against misplaced ratio
- apply some optimization step
- initially implement 'upmap' only
Signed-off-by: Sage Weil <sage@redhat.com>
(cherry picked from commit
0d9685c50f79fbb53dbc8bd98c95900ef6e902b8 )
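The control loop outlined in the list above could look roughly like this. The function names, the PG summary keys, and the 0.05 misplaced-ratio limit are hypothetical; the sleep function and round count are injectable only so the sketch can be exercised without waiting.

```python
import time


def should_optimize(pg_summary, misplaced_ratio, max_misplaced=0.05):
    """Decide whether it is safe to apply another optimization step."""
    # Back off while any PGs are unknown, inactive or degraded.
    for state in ("unknown", "inactive", "degraded"):
        if pg_summary.get(state, 0) > 0:
            return False
    # Throttle: do not move more data while too much is already misplaced.
    return misplaced_ratio < max_misplaced


def serve(get_state, optimize, sleep=time.sleep, interval=60, rounds=None):
    """Wake up every `interval` seconds and optimize when it is safe."""
    done = 0
    while rounds is None or done < rounds:
        pg_summary, misplaced_ratio = get_state()
        if should_optimize(pg_summary, misplaced_ratio):
            optimize()
        sleep(interval)
        done += 1
```

Here `optimize` stands in for whichever step is applied; per the commit message, only 'upmap' was implemented initially.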
Sage Weil [Tue, 11 Jul 2017 20:26:16 +0000 (16:26 -0400)]
pybind/mgr/mgr_module: add default arg to get_config
Signed-off-by: Sage Weil <sage@redhat.com>
(cherry picked from commit
39c42ddb9339c1950a3a474e8083db8b24e775a6 )
Sage Weil [Tue, 11 Jul 2017 03:23:19 +0000 (23:23 -0400)]
mgr: add trivial OSDMap wrapper class
Signed-off-by: Sage Weil <sage@redhat.com>
(cherry picked from commit
2ef005196ba2eb49c34c32def624938c7a8beb03 )
Sage Weil [Thu, 27 Jul 2017 14:06:45 +0000 (10:06 -0400)]
mgr/PyModules: add 'pg_dump' get
Signed-off-by: Sage Weil <sage@redhat.com>
(cherry picked from commit
bfb9286f4212947183c46543d609b664ea13b489 )
Sage Weil [Tue, 11 Jul 2017 20:25:42 +0000 (16:25 -0400)]
mgr/PyModules: add 'pg_status' dump
This is summary info, same as what's in 'ceph status'.
Signed-off-by: Sage Weil <sage@redhat.com>
(cherry picked from commit
85b5b80906d00e098d4b1af1354c60a357022dd2 )
Casey Bodley [Wed, 1 Nov 2017 20:28:25 +0000 (16:28 -0400)]
Merge pull request #18674 from ceph/wip-rgw-s3-branch
qa/tests: use ceph-luminous branch for s3tests
Reviewed-by: Casey Bodley <cbodley@redhat.com>
Vasu Kulkarni [Wed, 1 Nov 2017 17:32:07 +0000 (10:32 -0700)]
qa: use ceph-luminous branch for s3tests
Signed-off-by: Vasu Kulkarni <vasu@redhat.com>