Sage Weil [Wed, 1 Aug 2018 01:16:30 +0000 (20:16 -0500)]
Merge PR #23223 into master
* refs/pull/23223/head:
osd/PG: kill dead functions and related options
osd/osd_type: kill unused input ec_pool for iterate_mayberw_back_to
common: kill dead options
osd/PG: do not initialize up/acting twice
osd/PG: clear missing_loc properly if last location is gone
Sage Weil [Tue, 31 Jul 2018 22:23:48 +0000 (17:23 -0500)]
Merge PR #22692 into master
* refs/pull/22692/head:
doc/mgr/devicehealth: document devicehealth module
doc/rados/operations/health-checks: document DEVICE_HEALTH* messages
mgr/devicehealth: fix style for returns
mgr/devicehealth: use constants for health warnings
mgr/devicehealth: deal with as many daemons as we can until limit
mgr/devicehealth: warn if too many daemons are expected to fail soon
mgr/devicehealth: set primary-affinity 0 for failing devices
mgr/devicehealth: fix config options
mgr/devicehealth: only fetch osdmap once from check_health
mgr/devicehealth: revise health messages
mgr/devicehealth: add 'device check-health' command and run periodically
mgr/devicehealth: fix new options
mgr/devicehealth: add helpers to life_expectancy_response()
mgr/devicehealth: simplify setting defaults
common/blkdev remove debug statements
Yaarit Hatuka [Mon, 25 Jun 2018 13:19:22 +0000 (08:19 -0500)]
mgr/devicehealth: add helpers to life_expectancy_response()
- if mark_out_threshold is met we write to log.warn instead of raising a
health warning.
- check that OSD is 'in' before calling mark_out().
- raise a health warning in case OSD is marked 'out' but still has PGs
attached to it.
- cast thresholds default values to string.
- add SCSI multipath support to health warning message.
- change health warning message.
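The checks listed above can be sketched roughly as follows. This is an illustrative Python sketch only, not the module's code: the `Osd` class, the `handle_failing_osd` helper, and the message strings are hypothetical names chosen for the example.

```python
from dataclasses import dataclass
from typing import List, Tuple


@dataclass
class Osd:
    """Hypothetical stand-in for the OSD state the module inspects."""
    osd_id: int
    is_in: bool
    num_pgs: int


def handle_failing_osd(osd: Osd, threshold_met: bool) -> Tuple[List[str], List[str]]:
    """Return (log_warnings, health_warnings) for one OSD on a failing device."""
    log_warnings: List[str] = []
    health_warnings: List[str] = []
    if not osd.is_in:
        # Already 'out': only complain if PGs are still attached to it.
        if osd.num_pgs > 0:
            health_warnings.append(
                f"osd.{osd.osd_id} is marked out but still has {osd.num_pgs} PGs")
    elif threshold_met:
        # Threshold met: log a warning rather than raise a health warning,
        # and mark the OSD out only because it is currently 'in'.
        log_warnings.append(f"marking out osd.{osd.osd_id} (device expected to fail)")
        osd.is_in = False
    return log_warnings, health_warnings
```

Checking the 'in' state first means an OSD that is already out is never marked out a second time.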
src/osd/PG.cc: remove redundant call to trim_log()
This change is motivated by the failure tracked in
https://tracker.ceph.com/issues/25198. The failure highlights a case where a
call to trim_log() after the PG has recovered races with the previous op
on a replica OSD. Since the previous operation has not completed, the
last_complete value for that OSD is not valid when we try to trim the
log. It is also worth noting that the race is due to MOSDPGTrim going through
the strict queue as a peering message, whereas regular ops go through the
non-strict queue.
During the investigation of this bug, we noticed that, with
https://tracker.ceph.com/issues/23979, we allow pg log trimming to
happen on the primary and replicas whenever we cross the upper bound of
the pg log. This also ensures that pg log trimming happens while processing
any new op.
Therefore, the function trim_log(), which earlier served the purpose of
trimming logs on the primary and replicas just before the PG went into
the Recovered state, is no longer required. It acted as a last line of
defense to trim logs when we did not need them any more. But this call
seems redundant now, because we limit the pg log length at all times.
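The reasoning above (a log that is trimmed on every new op never needs a separate final trim) can be illustrated with a small model. This is a simplified sketch under stated assumptions: `BoundedLog` and `max_entries` are hypothetical names, and real pg log trimming depends on more state than a simple length bound.

```python
from collections import deque


class BoundedLog:
    """Hypothetical model of a log that is trimmed while processing each op."""

    def __init__(self, max_entries: int):
        self.max_entries = max_entries
        self.entries = deque()

    def append(self, entry) -> int:
        """Add an entry, trim past the upper bound, return how many were trimmed."""
        self.entries.append(entry)
        trimmed = 0
        while len(self.entries) > self.max_entries:
            self.entries.popleft()  # drop the oldest entry
            trimmed += 1
        return trimmed


log = BoundedLog(max_entries=3)
for op in range(10):
    log.append(op)
# The bound holds after every op, so there is nothing left for a final trim.
remaining = list(log.entries)
```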
Sage Weil [Mon, 30 Jul 2018 19:18:07 +0000 (14:18 -0500)]
pybind/rados/rados: do not pass prval from stack
The prval is a pointer to an int to write the final completion code of
the rados op. This can't be on the stack since we immediately leave the
current scope after preparing the op (looong before we do the rados op).
We keep the tuple return value to avoid breaking users of this API
(devicehealth module, gnocchi at a minimum).
Fixes: http://tracker.ceph.com/issues/25175
Signed-off-by: Sage Weil <sage@redhat.com>
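The lifetime problem described above can be illustrated with a small ctypes analogy. This is not the rados binding code: `PendingOp`, `prepare_op`, and `complete_op` are hypothetical names, and `complete_op` merely stands in for the library writing the completion code long after the op was prepared.

```python
import ctypes


class PendingOp:
    """Hypothetical op that owns the storage its completion code is written to."""

    def __init__(self):
        # The int lives on the op object, so it outlives the preparing function.
        self.prval = ctypes.c_int(0)

    def prval_pointer(self):
        return ctypes.pointer(self.prval)


def prepare_op():
    """Prepare an op and return it with a pointer to its result slot."""
    op = PendingOp()
    return op, op.prval_pointer()


def complete_op(ptr, code: int) -> None:
    """Stand-in for the later write of the final completion code."""
    ptr[0] = code


op, ptr = prepare_op()      # prepare_op() has already returned here
complete_op(ptr, -2)        # the write still lands in storage the op owns
result = op.prval.value
```

Because the slot belongs to the op object rather than to a local in `prepare_op()`, the late write has a valid destination.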
cmake,make-dist: build gperftools if WITH_STATIC_LIBSTDCXX
We could create a mini project to build a shared library and use
try_compile() to test whether the found gperftools is compiled with -fPIC.
But we are targeting mostly xenial when enabling
WITH_STATIC_LIBSTDCXX, and google-perftools on xenial is by default
built without -fPIC, so let's keep it simple.
- do not link libkv with ALLOC_LIBS: it turns out that if we link
tcmalloc *before* -static-libstdc++ -static-libgcc, libstdc++ and gcc
libs will show up in `ldd` output
- add `-static-libstdc++ -static-libgcc` to CMAKE_SHARED_LINKER_FLAGS
and CMAKE_EXE_LINKER_FLAGS instead of adding them to all shared
libraries and executables. Simpler this way.
- link against libtcmalloc statically: because libtcmalloc is a C++
library, linking against it dynamically while linking against the C++ runtime
statically will pull in dependencies on two versions of the C++ runtime, which
will bring down the app at run-time.
- do not pass '-pie' to the linker when building an executable if
`WITH_STATIC_LIBSTDCXX` is set and tcmalloc is used, because the static tcmalloc
is not compiled with PIC.
- only apply '-pie' if ENABLE_SHARED is enabled.
crimson/common: write configs synchronously on shard.0
to avoid potential races on the same shard. Before this change, we
applied changes asynchronously. After this change, all changes happen on the
owner shard (i.e. shard.0) and are applied synchronously.
This is simpler, and it allows us to have more detailed error messages
that we can present to the end user. Skipping the update step if no
change is made would be nice to have, but changing settings is not on the
critical path, so let's keep it simple.
common/config: do not use magic number in set_mon_vals()
Before this change, we compared the retcode of _set_val() with 1 and 0,
which are practically magic numbers. After this change, we use
ConfigValues::set_value_result_t for non-error retcodes.
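The idea of naming non-error return codes can be sketched with a Python analog. The actual change is in C++; the enum `SetValueResult`, its member names, the mapping of values to meanings, and the `set_val` helper below are all assumptions made for illustration.

```python
import errno
from enum import IntEnum


class SetValueResult(IntEnum):
    """Hypothetical named non-error results (member names are assumptions)."""
    NO_CHANGE = 0
    CHANGED = 1


def set_val(store: dict, key: str, value: str) -> int:
    """Return a negative errno on error, otherwise a SetValueResult."""
    if not key:
        return -errno.EINVAL
    if store.get(key) == value:
        return SetValueResult.NO_CHANGE
    store[key] = value
    return SetValueResult.CHANGED


store = {}
first = set_val(store, "debug_level", "5")
second = set_val(store, "debug_level", "5")
# Callers compare against names instead of bare 0 and 1.
changed_once = first == SetValueResult.CHANGED and second == SetValueResult.NO_CHANGE
```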
Stephan Müller [Fri, 27 Jul 2018 14:15:07 +0000 (16:15 +0200)]
mgr/dashboard: Fix duplicate error messages
Duplicate error messages currently appear if the task wrapper service is
used. It calls 'notifyTask' on a failed task. This would be fine if
we didn't have the API interceptor, which watches all API requests and
triggers 'notifyTask' itself if an error appears.
seastar actually requires fmt 4.0.0 and up, as 3.0.2 does not offer
fmt/printf.h. See
https://github.com/fmtlib/fmt/blob/master/ChangeLog.rst#400---2017-06-27
Douglas Fuller [Fri, 29 Jun 2018 17:55:31 +0000 (13:55 -0400)]
mon/OSDMonitor: Warn if missing expected_num_objects
When creating a pool on filestore, warn if the user appears to be
creating a pool to store a large number of objects but omitted the
expected_num_objects parameter. Create the pool anyway.
Fixes: http://tracker.ceph.com/issues/24687
Signed-off-by: Douglas Fuller <dfuller@redhat.com>
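The warn-but-create behaviour described above can be sketched as follows. This is a hedged illustration, not the monitor's logic: the commit does not say how a "large" pool is detected, so the `large_pg_num` threshold, the `check_pool_create` helper, and the warning text are assumptions.

```python
from typing import Optional, Tuple


def check_pool_create(backend: str, pg_num: int, expected_num_objects: int,
                      large_pg_num: int = 1024) -> Tuple[bool, Optional[str]]:
    """Return (create_pool, warning); the pool is created in every case.

    large_pg_num is a hypothetical heuristic for "a large number of objects".
    """
    warning = None
    if (backend == "filestore"
            and pg_num >= large_pg_num
            and expected_num_objects == 0):
        warning = ("pool looks large but expected_num_objects was not given; "
                   "consider setting it")
    # Warn only: creation is never refused on these grounds.
    return True, warning
```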
https://bugzilla.redhat.com/show_bug.cgi?id=1603615 describes
a case where the PG count from pg calc conflicts with mon_max_pg_per_osd, so
pool creation is not allowed when this limit is 200. Hence, increase this limit
to avoid this.
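The kind of conflict described above can be shown with a little arithmetic. This is a sketch under stated assumptions: the helper names, the cluster numbers, and the exact form of the check (projected PG replicas per OSD compared with the limit) are illustrative, and the alternative limit of 300 is an arbitrary example value rather than the new default.

```python
from typing import List, Tuple


def projected_pgs_per_osd(pools: List[Tuple[int, int]], num_osds: int,
                          new_pg_num: int, new_size: int) -> float:
    """Average PG replicas per OSD after adding a pool.

    pools holds (pg_num, replica_size) for each existing pool.
    """
    total = sum(pg * size for pg, size in pools) + new_pg_num * new_size
    return total / num_osds


def can_create(pools, num_osds, new_pg_num, new_size, max_pg_per_osd) -> bool:
    """Assumed check: allow creation only while the projection fits the limit."""
    return projected_pgs_per_osd(pools, num_osds, new_pg_num, new_size) <= max_pg_per_osd


existing = [(128, 3)]   # one existing pool: 128 PGs, 3 replicas
projection = projected_pgs_per_osd(existing, 3, 128, 3)   # (384 + 384) / 3
refused_at_200 = not can_create(existing, 3, 128, 3, 200)
allowed_at_300 = can_create(existing, 3, 128, 3, 300)
```

With three OSDs, a second 128-PG pool at 3 replicas already projects to 256 PG replicas per OSD, which exceeds a limit of 200.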