Sage Weil [Fri, 4 Jan 2019 02:38:14 +0000 (20:38 -0600)]
Merge PR #25623 into master
* refs/pull/25623/head:
common/ceph_time: 'mo' for month
common/options: use new parse_timespan
common/ceph_time: add parse_timespan
common/config_proxy: pass err_ss through on set_val
common/ceph_time: add exact_timespan_str
Sage Weil [Fri, 4 Jan 2019 02:11:11 +0000 (20:11 -0600)]
Merge PR #25672 into master
* refs/pull/25672/head:
osd: OSD device smart data include additional nvme data
common/blkdev: add missing get_device_id impl
os/bluestore,filestore: use get_raw_devices
osd: update metadata and smart code to report get_device_id errors
mon: update metadata and smart commands to use get_raw_devices
common/blkdev: add get_raw_devices helper
common/blkdev: fix BlkDev::get_devid when we got a devname, not fd
common/blkdev: return optional error string from get_device_id
common/blkdev: refactor to add block_device_get_metrics returning json
hsiang41 [Fri, 28 Dec 2018 09:07:32 +0000 (17:07 +0800)]
osd: OSD device smart data include additional nvme data
Add additional nvme data to the device health data. This uses the nvme tool
with the command syntax "nvme <vendor> smart-log-add <dev> -json". The nvme
json output is appended to the device smart data under "nvme_smart_health_information_add_log".
- made run_smartctl static/private
- changed get_metrics to take a const string, not c str
Signed-off-by: Rick Chen <rick.chen@prophetstor.com> Signed-off-by: Sage Weil <sage@redhat.com>
Sage Weil [Thu, 3 Jan 2019 15:03:39 +0000 (09:03 -0600)]
Merge PR #25736 into master
* refs/pull/25736/head:
common/options: document some osd/rados options
unittest_osdmap: feed options as defaults
mon/OSDMonitor: allow osd_pool_default_pgp_num to be 0
Kefu Chai [Thu, 3 Jan 2019 11:00:19 +0000 (19:00 +0800)]
pybind/rgw: pass the flags to callback function
before this change, the `flags` parameter passed to `LibRGWFS.readdir()`
was silently dropped.
after this change, it is passed to the specified callback function.
Yingxin [Mon, 17 Dec 2018 13:51:49 +0000 (21:51 +0800)]
crimson/net: fix address learning during banner exchange
* Don't store my_addr in `Connection`, because my_addr can be learned
and thus changed.
* Support nonce in SocketMessenger.
* Always set nonce when set_myaddr().
* Add learned_addr() for SocketMessenger.
* Add side_t and socket_port to show the real connecting
ports of the SocketConnection.
* Fix banner exchange logic for addresses, including nonce, type, ip,
port, socket_port for my_addr and peer_addr.
* Add more detailed logging prefixes for SocketConnection.
Jianpeng Ma [Thu, 3 Jan 2019 06:34:43 +0000 (14:34 +0800)]
ceph_argparse: make command ceph accept SIGINT.
If --connect-timeout is not specified and there is no ceph.conf, the command
'ceph -s' can't be stopped with ctrl+c.
This was introduced by commit 4d8fc26c8.
Signed-off-by: Jianpeng Ma <jianpeng.ma@intel.com>
Kefu Chai [Thu, 20 Dec 2018 03:24:28 +0000 (11:24 +0800)]
cls/rbd: init local var with known value
DirectoryState does not have an "invalid" enumerator so far. since it's
defined using `enum class`, initializing a variable of this type with a known
value is a better choice, even if it is always initialized before being read.
this silences the GCC warning of:
src/cls/rbd/cls_rbd.cc:3147:3: warning: ‘on_disk_directory_state’ may be
used uninitialized in this function [-Wmaybe-uninitialized]
if (directory_state != on_disk_directory_state) {
^~
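The idea can be sketched as follows. This is a minimal illustration and not the code in cls_rbd.cc: the enumerator names and the helper functions `read_state()` and `state_changed()` are assumptions made for the example.

```cpp
#include <cstdint>

// Hypothetical stand-in for the real DirectoryState; the enumerator names
// are assumptions. Note that there is no "invalid" enumerator.
enum class DirectoryState : uint8_t {
  READY = 0,
  ADD_DISABLED = 1,
};

// Hypothetical reader: writes to `out` only when a stored state exists.
inline bool read_state(const DirectoryState* stored, DirectoryState* out) {
  if (stored == nullptr) {
    return false;
  }
  *out = *stored;
  return true;
}

inline bool state_changed(DirectoryState requested,
                          const DirectoryState* stored) {
  // Initialize with a known enumerator. The value is overwritten before it
  // is read, but the explicit initializer keeps -Wmaybe-uninitialized quiet.
  DirectoryState on_disk_state = DirectoryState::READY;
  if (!read_state(stored, &on_disk_state)) {
    return false;  // nothing on disk to compare against
  }
  return requested != on_disk_state;
}
```

The compiler cannot always prove that the out-parameter is written on every path, so the initializer costs nothing at runtime and removes the warning.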
The stats entries for rgw buckets have a category, which used a
combination of uint8_t and enum RGWObjClass. Clean this up by
converting RGWObjClass to an enum class and using that
throughout. This provides type safety and better code clarity. Also,
add some source code documentation.
Signed-off-by: J. Eric Ivancich <ivancich@redhat.com>
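A rough sketch of the type-safety benefit, under stated assumptions: `ObjCategory`, `CategoryStats`, `account()` and `to_wire()` are hypothetical names for illustration and do not mirror the actual rgw definitions.

```cpp
#include <cstdint>
#include <map>

// Hypothetical category type; the enumerators are illustrative only.
enum class ObjCategory : uint8_t {
  None = 0,
  Main = 1,
  Shadow = 2,
  MultiMeta = 3,
};

struct CategoryStats {
  uint64_t num_entries = 0;
  uint64_t total_size = 0;
};

// Keyed by the scoped enum, so a bare uint8_t cannot be passed by accident.
using StatsMap = std::map<ObjCategory, CategoryStats>;

inline void account(StatsMap& stats, ObjCategory category, uint64_t size) {
  CategoryStats& s = stats[category];
  ++s.num_entries;
  s.total_size += size;
}

// The conversion to the raw integer is explicit and happens in one place,
// for example at an encoding boundary.
inline uint8_t to_wire(ObjCategory c) {
  return static_cast<uint8_t>(c);
}
```

With a scoped enum, a call such as `account(stats, 1, 10)` no longer compiles, which is the kind of mixing of uint8_t and enum values the cleanup is meant to rule out.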
Kefu Chai [Wed, 2 Jan 2019 16:20:13 +0000 (00:20 +0800)]
osd/OSDMap: set pg_autoscale_mode with setting from conf
* update build_simple_optioned() to set pg_autoscale_mode with the setting
read from conf, otherwise it will be a random value in heap.
* update cli test accordingly, otherwise we will have
Venky Shankar [Mon, 5 Nov 2018 05:51:53 +0000 (00:51 -0500)]
mds: dump scrub formatted output when context completion
Include the scrub tag as part of the output and move the
formatting into the context completion to support scrub operations
triggered via the tell interface (introduced later).
Venky Shankar [Mon, 5 Nov 2018 05:49:44 +0000 (00:49 -0500)]
mds: generate random scrub tag when empty
With this, scrub operations are tagged with a random
uuid if tag is unspecified. This also helps to show
in-progress scrub operations via "scrub status" command
(introduced in later commits).
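The tagging rule could look roughly like the sketch below. It assumes a hand-rolled random version-4-style UUID string; the MDS presumably uses Ceph's own uuid type instead, and `make_random_tag()` and `effective_scrub_tag()` are hypothetical names.

```cpp
#include <random>
#include <string>

// Hypothetical helper: builds a random version-4-style UUID string.
inline std::string make_random_tag() {
  static const char hex[] = "0123456789abcdef";
  std::random_device rd;
  std::mt19937_64 gen(rd());
  std::uniform_int_distribution<int> nibble(0, 15);
  std::string out;
  for (int i = 0; i < 36; ++i) {
    if (i == 8 || i == 13 || i == 18 || i == 23) {
      out += '-';
    } else if (i == 14) {
      out += '4';                            // version nibble
    } else if (i == 19) {
      out += hex[8 + (nibble(gen) & 3)];     // variant nibble: 8, 9, a or b
    } else {
      out += hex[nibble(gen)];
    }
  }
  return out;
}

// Use the caller's tag when given, otherwise fall back to a random one.
inline std::string effective_scrub_tag(const std::string& requested) {
  return requested.empty() ? make_random_tag() : requested;
}
```

Because every scrub then carries a non-empty tag, a status listing can key in-progress operations by tag without a special case for untagged scrubs.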
Venky Shankar [Mon, 5 Nov 2018 05:10:15 +0000 (00:10 -0500)]
mds: introduce C_ExecAndReply context completion class
Tell commands that need to reply asynchronously can
subclass C_ExecAndReply() and implement the exec() virtual
function to support asynchronous execution of the command
(via the finisher thread).
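The pattern can be sketched as below. This is a simplified illustration, not the MDS code: `Context` here is a stripped-down stand-in (the real one also manages its own lifetime), and `ExecAndReply`, `ReplyFn` and `ScrubStatusCommand` are hypothetical names.

```cpp
#include <functional>
#include <sstream>
#include <string>
#include <utility>

// Simplified stand-in for a completion context: complete() runs finish().
class Context {
public:
  virtual ~Context() = default;
  void complete(int r) { finish(r); }
protected:
  virtual void finish(int r) = 0;
};

// Hypothetical sketch of the pattern: exec() does the work, and completing
// the context sends the accumulated output back to the caller.
class ExecAndReply : public Context {
public:
  using ReplyFn = std::function<void(int, const std::string&)>;
  explicit ExecAndReply(ReplyFn reply) : reply_(std::move(reply)) {}

  // Subclasses implement the command body here.
  virtual void exec() = 0;

protected:
  void finish(int r) override { reply_(r, out_.str()); }
  std::ostringstream out_;

private:
  ReplyFn reply_;
};

// Example subclass; the command and its output are made up.
class ScrubStatusCommand : public ExecAndReply {
public:
  using ExecAndReply::ExecAndReply;
  void exec() override {
    out_ << "scrub status: idle";
    complete(0);  // could equally be called later, once async work is done
  }
};
```

In this sketch a worker (the finisher thread in the commit's description) would call `exec()`, and the reply is sent whenever the command completes the context, which need not be inside `exec()` itself.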
Kefu Chai [Tue, 1 Jan 2019 08:38:15 +0000 (16:38 +0800)]
crimson: workaround an ICE of GCC
this change works around the FTBFS on arm64:
/home/jenkins-build/build/workspace/ceph-pull-requests-arm64/src/crimson/common/config_proxy.h:74:13:
internal compiler error: in tsubst_decomp_names, at cp/pt.c:16537
for (auto& [obs, keys] : rev_obs) {
^~~
Please submit a full bug report,
with preprocessed source if appropriate.
it seems that this issue is a dup of
https://bugzilla.redhat.com/show_bug.cgi?id=1639019 .
Kefu Chai [Tue, 1 Jan 2019 07:49:05 +0000 (15:49 +0800)]
librbd: workaround an ICE of GCC
GCC is somehow annoyed at seeing the combination of decltype and
initializer_list in this place. i tried to remove the `if` clause, and
only left the `else` block, GCC was happy with that change. i also tried
to pass an empty `{}` to `decltype(reply.lockers)`, and GCC was also
happy with that. so i guess there are multiple factors taking effect in
this problem. probably any of them could be the last straw that breaks
GCC.
but we cannot produce a minimal reproducer for this issue here without more
effort. and `reply.lockers` is empty after `reply` is constructed, so
it would be simpler if we just add the locker info to it instead of
assigning a newly constructed `map` to it.
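A minimal sketch of the shape of this workaround, assuming simplified types: `LockerInfo`, `Reply` and `build_reply()` are hypothetical and do not reproduce the librbd code or the expression that triggered the ICE.

```cpp
#include <map>
#include <string>

// Hypothetical, simplified types for illustration.
struct LockerInfo {
  std::string address;
};

struct Reply {
  std::map<std::string, LockerInfo> lockers;  // empty after construction
};

inline Reply build_reply(bool have_locker, const std::string& cookie,
                         const std::string& address) {
  Reply reply;
  if (have_locker) {
    // Add to the existing (empty) map rather than assigning a newly
    // constructed map built from decltype plus an initializer list.
    reply.lockers.emplace(cookie, LockerInfo{address});
  }
  return reply;
}
```

Since the map starts out empty, inserting into it yields the same result as the assignment while avoiding the construct the compiler choked on.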
Sage Weil [Tue, 1 Jan 2019 04:31:18 +0000 (22:31 -0600)]
Merge PR #25597 into master
* refs/pull/25597/head:
mgr/hello: define some module options
pybind/mgr/mgr_module: normalize defaults to str when type is missing
mgr/telegraf: specify option types
mgr/telemetry: specify option types
mgr/zabbix: specify option types
mgr/localpool: document options and specify in native types
mgr/devicehealth: document options and specify in native types
mgr/balancer: document and specify options
common/options: make runtime vs not runtime explicit, not type-dependent
mon/MgrMonitor: mark module options with FLAG_MGR
mon/MgrMonitor: make find_module_option handle localized option names
pybind/mgr/mgr_module: set values as string
mgr: return options as appropriate python type
mgr/PythonCompat: python 3 cludges
pybind/mgr/mgr_module: make use of defined default value
mgr/PyModule: populate ModuleOptions and expose to mon
mon/ConfigMonitor: unify module options with built-in options
mon/MgrMonitor: populate options
mon/MgrMap: define ModuleOptions and include in map
common/options: expand type helpers and make them static
common/options: pin enums to values
Reviewed-by: Tim Serong <tserong@suse.com> Reviewed-by: Ernesto Puerta <epuertat@redhat.com>
Sage Weil [Tue, 1 Jan 2019 04:30:40 +0000 (22:30 -0600)]
Merge PR #25652 into master
* refs/pull/25652/head:
osd/OSDMap: disallow new upmaps on pgs that are pending merge
osd/PG: align past_intervals and last_epoch_clean for fabricated merge target
Sage Weil [Mon, 31 Dec 2018 17:05:03 +0000 (11:05 -0600)]
osd: reliably send pg_created messages to the mon
The OSD has to reliably deliver a pg_created message to the mon in order
for the mon to clear the pool's CREATING flag. Previously, a mon
connection reset would drop the message.
Restructure this to:
- queue a message any time a PG peers and the pool has the CREATING flag
- track pending messages in OSDService
- resend on mon connect
- prune messages for pools that no longer have the CREATING flag
This new strategy can result in resends of these messages to the mon in
cases where the mon already knows the PG was created. However, pool
creation is rare, and these extra messages are cheap. We can also avoid
this overhead, if we choose, by limiting the number of PGs that the mon can
create explicitly (by lowering mon_osd_max_initial_pgs).
Fixes: http://tracker.ceph.com/issues/37775 Signed-off-by: Sage Weil <sage@redhat.com>
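The four steps above can be sketched as a small tracker. This is an illustration under stated assumptions, not the OSDService code: `PendingCreates`, `PgId` and the callback signatures are hypothetical, and locking is omitted.

```cpp
#include <cstddef>
#include <cstdint>
#include <functional>
#include <set>
#include <utility>

// Hypothetical PG identifier: (pool id, pg seed).
using PgId = std::pair<int64_t, uint32_t>;

class PendingCreates {
public:
  using SendFn = std::function<void(const PgId&)>;
  explicit PendingCreates(SendFn send) : send_(std::move(send)) {}

  // Queue (and send) a message any time a PG peers in a CREATING pool.
  void on_pg_peered(const PgId& pg, bool pool_creating) {
    if (!pool_creating) {
      return;
    }
    pending_.insert(pg);
    send_(pg);
  }

  // Resend everything still pending when the mon connection comes back.
  void on_mon_connect() {
    for (const PgId& pg : pending_) {
      send_(pg);
    }
  }

  // Drop entries for pools that no longer have the CREATING flag.
  void prune(const std::function<bool(int64_t)>& pool_is_creating) {
    for (auto it = pending_.begin(); it != pending_.end();) {
      if (!pool_is_creating(it->first)) {
        it = pending_.erase(it);
      } else {
        ++it;
      }
    }
  }

  std::size_t size() const { return pending_.size(); }

private:
  SendFn send_;
  std::set<PgId> pending_;
};
```

Keeping the pending set until the pool's flag clears is what makes delivery reliable across connection resets, at the cost of the occasional duplicate message mentioned above.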
Kefu Chai [Sun, 30 Dec 2018 13:57:04 +0000 (21:57 +0800)]
osd: unlock osd_lock when tweaking osd settings
unlock osd_lock when serving "debug kick_recovery_wq" command
we need to unlock osd_lock temporarily when updating the osd settings,
otherwise we will run into an assert failure, because
OSD::handle_conf_change() acquires the osd_lock, which is not a recursive
lock.
Kefu Chai [Sun, 30 Dec 2018 13:46:55 +0000 (21:46 +0800)]
osd: use unlock_guard for unlock osd temporarily
when OSD::do_command() gets called, osd_lock is acquired. but when
serving some of these commands, we need to call methods which also
acquire the osd_lock by themselves. for instance,
OSD::handle_conf_change() gets called by cct->_conf.apply_changes().
to allow them to do so, we unlock osd_lock before calling those methods,
and re-lock it after done with them.
unlock_guard is introduced to unlock and re-lock the lock in a RAII style.
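A minimal sketch of such a guard, assuming std::mutex in place of Ceph's lock type. The `unlock_guard` template below is a hypothetical minimal version and `Osd` is a toy stand-in, so neither should be read as the actual implementation.

```cpp
#include <mutex>

// Inverse of std::lock_guard: unlocks on construction, re-locks on
// destruction. The caller must hold the mutex when constructing it.
template <typename Mutex>
class unlock_guard {
public:
  explicit unlock_guard(Mutex& m) : m_(m) { m_.unlock(); }
  ~unlock_guard() { m_.lock(); }
  unlock_guard(const unlock_guard&) = delete;
  unlock_guard& operator=(const unlock_guard&) = delete;

private:
  Mutex& m_;
};

// Toy stand-in for the OSD; member names follow the commit message.
struct Osd {
  std::mutex osd_lock;  // not recursive
  int applied_changes = 0;

  // Acquires osd_lock itself, like OSD::handle_conf_change().
  void handle_conf_change() {
    std::lock_guard<std::mutex> l(osd_lock);
    ++applied_changes;
  }

  // Holds osd_lock for the whole command, except around the nested call.
  void do_command() {
    std::lock_guard<std::mutex> l(osd_lock);
    {
      unlock_guard<std::mutex> unlocked(osd_lock);
      handle_conf_change();  // would self-deadlock without the guard
    }
    // osd_lock is held again here.
  }
};
```

The RAII form guarantees the lock is re-acquired on every exit path from the inner scope, including early returns and exceptions, which a manual unlock() and lock() pair does not.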