git.apps.os.sepia.ceph.com Git - ceph.git/log
ceph.git
6 years agoMerge PR #24761 into nautilus
Sage Weil [Sat, 27 Oct 2018 03:07:27 +0000 (22:07 -0500)]
Merge PR #24761 into nautilus

* refs/pull/24761/head:
osd: fix race between op_wq and context_queue

Reviewed-by: xie xingguo <xie.xingguo@zte.com.cn>
Reviewed-by: Jianpeng Ma <jianpeng.ma@intel.com>
6 years agoMerge PR #24651 into nautilus
Sage Weil [Sat, 27 Oct 2018 02:07:09 +0000 (21:07 -0500)]
Merge PR #24651 into nautilus

* refs/pull/24651/head:
test: Make sure kill_daemons failure will be easy to find
test: Add flush_pg_stats to make test more deterministic

Reviewed-by: Neha Ojha <nojha@redhat.com>
6 years agoosd: fix race between op_wq and context_queue 24761/head
Sage Weil [Thu, 25 Oct 2018 19:24:02 +0000 (14:24 -0500)]
osd: fix race between op_wq and context_queue

        ThreadA                                 ThreadB
  sdata->shard_lock.Lock();
  if (sdata->pqueue->empty() &&
      !(is_smallest_thread_index && !sdata->context_queue.empty())) {

                                                void queue(list<Context *>& ls) {
                                                  bool empty = false;
                                                  {
                                                    std::scoped_lock l(q_mutex);
                                                    if (q.empty()) {
                                                      q.swap(ls);
                                                      empty = true;
                                                    } else {
                                                      q.insert(q.end(), ls.begin(), ls.end());
                                                    }
                                                  }

                                                  if (empty) {
                                                    mutex.Lock();
                                                    cond.Signal();
                                                    mutex.Unlock();
                                                  }
                                                }

    sdata->sdata_wait_lock.Lock();
    if (!sdata->stop_waiting) {

Fix by simply rechecking that context_queue is empty after taking the
wait lock.  We still check it without taking that lock to keep the hot/busy
path fast (we avoid the wait lock in general) at the expense of taking
the context_queue qlock twice in the idle/wait path (where we don't care
so much about additional latency/cycles).
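
The lost-wakeup window and the recheck can be illustrated in isolation. What follows is a minimal sketch built on standard-library primitives, not the OSD code: `ToyQueue`, `Shard`, `queue` and `wait_for_work` are hypothetical names, and the producer here signals on every enqueue for simplicity. The point it shows is the second emptiness check made while the wait lock is held, which is what closes the gap between the unlocked check and the wait.

```cpp
#include <cassert>
#include <chrono>
#include <condition_variable>
#include <list>
#include <mutex>

// Hypothetical stand-in for context_queue: its own mutex guards the list.
struct ToyQueue {
  std::mutex q_mutex;
  std::list<int> q;

  bool empty() {
    std::scoped_lock l(q_mutex);
    return q.empty();
  }
};

// Hypothetical stand-in for the per-shard wait state.
struct Shard {
  std::mutex wait_lock;
  std::condition_variable cond;
  ToyQueue context_queue;

  // Producer: publish the item first, then signal while holding the wait lock.
  void queue(int item) {
    {
      std::scoped_lock l(context_queue.q_mutex);
      context_queue.q.push_back(item);
    }
    std::scoped_lock l(wait_lock);
    cond.notify_one();
  }

  // Consumer: returns true if work is available, false if the wait timed out.
  bool wait_for_work(std::chrono::milliseconds timeout) {
    assert(timeout.count() >= 0);
    // Fast path: check without touching wait_lock so the busy path stays cheap.
    if (!context_queue.empty())
      return true;
    std::unique_lock l(wait_lock);
    // The predicate is evaluated before sleeping, with wait_lock held, so an
    // item queued between the check above and this point is still noticed.
    return cond.wait_for(l, timeout,
                         [this] { return !context_queue.empty(); });
  }
};
```

The predicate form of `wait_for` takes the queue's own mutex a second time on the idle path, which mirrors the trade-off described above: the extra lock is paid only when the thread is about to sleep.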

Fixes: http://tracker.ceph.com/issues/36473
Signed-off-by: Jianpeng Ma <jianpeng.ma@intel.com>
Signed-off-by: Sage Weil <sage@redhat.com>
6 years agoMerge PR #24725 into nautilus
Patrick Donnelly [Wed, 24 Oct 2018 22:59:30 +0000 (15:59 -0700)]
Merge PR #24725 into nautilus

* refs/pull/24725/head:
mds: add missing mds_lock

Reviewed-by: Sage Weil <sage@redhat.com>
6 years agoqa/tasks/qemu: use unique clone directory to avoid race with workunit
Jason Dillaman [Mon, 22 Oct 2018 14:44:40 +0000 (10:44 -0400)]
qa/tasks/qemu: use unique clone directory to avoid race with workunit

If there is a workunit task associated with the same client, the two
tasks will attempt to clone the suite repo to the same directory.
Worse, if the tasks run in parallel, the two clones will clobber each
other.

Fixes: http://tracker.ceph.com/issues/36542
Signed-off-by: Jason Dillaman <dillaman@redhat.com>
(cherry picked from commit 5d56014c61b107dcb5d05c2221c2e844324f304c)

6 years agomds: add missing mds_lock 24725/head
Patrick Donnelly [Tue, 23 Oct 2018 22:20:09 +0000 (15:20 -0700)]
mds: add missing mds_lock

Fixes: http://tracker.ceph.com/issues/36573
Introduced-by: ecbd4a8aa8e6c1c72af4e0be15e0340629bfdc3a
Signed-off-by: Patrick Donnelly <pdonnell@redhat.com>
6 years agoMerge PR #24697 into nautilus 24698/head
Sage Weil [Tue, 23 Oct 2018 01:45:40 +0000 (20:45 -0500)]
Merge PR #24697 into nautilus

* refs/pull/24697/head:
ceph_test_msgr: fix authorizer behavior

Reviewed-by: Gregory Farnum <gfarnum@redhat.com>
6 years agoceph_test_msgr: fix authorizer behavior 24697/head
Sage Weil [Mon, 22 Oct 2018 15:00:28 +0000 (10:00 -0500)]
ceph_test_msgr: fix authorizer behavior

Fixes breakage from this PR 2152d8ffb73a507a3d08d48b38c5a8e73f887138.

Fixes: http://tracker.ceph.com/issues/36495
Signed-off-by: Sage Weil <sage@redhat.com>
6 years agoMerge pull request #24667 from liewegas/wip-ec-thrash-full
Josh Durgin [Mon, 22 Oct 2018 14:39:26 +0000 (07:39 -0700)]
Merge pull request #24667 from liewegas/wip-ec-thrash-full

qa/suites/rados/thrash-erasure-code*/thrashers/*: less likely resv rejection injection

Reviewed-by: Josh Durgin <jdurgin@redhat.com>
6 years agoMerge PR #24689 into nautilus
Sage Weil [Mon, 22 Oct 2018 14:20:50 +0000 (09:20 -0500)]
Merge PR #24689 into nautilus

* refs/pull/24689/head:
qa/tasks/ceph_manager: fix get_stuck_pgs from pg dump change

Reviewed-by: Kefu Chai <kchai@redhat.com>
6 years agoqa/tasks/ceph_manager: fix get_stuck_pgs from pg dump change 24689/head
Sage Weil [Sun, 21 Oct 2018 15:52:38 +0000 (10:52 -0500)]
qa/tasks/ceph_manager: fix get_stuck_pgs from pg dump change

Fixes 95b7d2340c04dc7cf90085c89606b8c85a8f2803

Fixes: http://tracker.ceph.com/issues/36485
Signed-off-by: Sage Weil <sage@redhat.com>
6 years agoqa/suites/rados/thrash-erasure-code*/thrashers/*: less likely resv rejection injection 24667/head
Sage Weil [Thu, 18 Oct 2018 22:15:08 +0000 (17:15 -0500)]
qa/suites/rados/thrash-erasure-code*/thrashers/*: less likely resv rejection injection

For EC pools we have a lot of shards, and a 30% probability on each one
means we are very likely to repeatedly fail backfill reservations, long
enough that teuthology gives up waiting.
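
A rough calculation shows how a per-shard probability compounds. The assumptions here are illustrative and not stated in the commit: each of $n$ shards rejects independently with probability $p = 0.3$, a backfill attempt needs all $n$ reservations to succeed, and $n = 6$ is chosen only as an example shard count.

```latex
P(\text{attempt fails}) = 1 - (1 - p)^{n}
\qquad\Rightarrow\qquad
1 - 0.7^{6} = 1 - 0.117649 \approx 0.88
```

Under those assumptions roughly nine attempts in ten would be rejected, which is consistent with the motivation for lowering the injection probability.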

Signed-off-by: Sage Weil <sage@redhat.com>
6 years agotest: Make sure kill_daemons failure will be easy to find 24651/head
David Zafman [Mon, 1 Oct 2018 18:17:45 +0000 (11:17 -0700)]
test: Make sure kill_daemons failure will be easy to find

Signed-off-by: David Zafman <dzafman@redhat.com>
6 years agotest: Add flush_pg_stats to make test more deterministic
David Zafman [Mon, 24 Sep 2018 21:35:39 +0000 (14:35 -0700)]
test: Add flush_pg_stats to make test more deterministic

Signed-off-by: David Zafman <dzafman@redhat.com>
6 years agoMerge PR #24625 into nautilus
Sage Weil [Wed, 17 Oct 2018 14:46:18 +0000 (09:46 -0500)]
Merge PR #24625 into nautilus

* refs/pull/24625/head:
qa/suites/rados/mgr/tasks/module_selftest: whitelist 'foo bar security'

Reviewed-by: Abhishek Lekshmanan <abhishek.lekshmanan@gmail.com>
6 years agoqa/suites/rados/mgr/tasks/module_selftest: whitelist 'foo bar security' 24625/head
Sage Weil [Tue, 16 Oct 2018 22:40:25 +0000 (17:40 -0500)]
qa/suites/rados/mgr/tasks/module_selftest: whitelist 'foo bar security'

Avoid failures like

"2018-10-16 20:36:00.437153 mgr.y (mgr.25609) 6 : cluster [SEC] foo bar security" in cluster log

Signed-off-by: Sage Weil <sage@redhat.com>
6 years agoMerge pull request #24629 from cyx1231st/wip-seastar-fix
Kefu Chai [Wed, 17 Oct 2018 11:22:03 +0000 (19:22 +0800)]
Merge pull request #24629 from cyx1231st/wip-seastar-fix

crimson/net: fix compile errors in test_alien_echo.cc

Reviewed-by: Kefu Chai <kchai@redhat.com>
6 years agocrimson/net: fix compile errors 24629/head
Yingxin [Wed, 17 Oct 2018 15:12:45 +0000 (23:12 +0800)]
crimson/net: fix compile errors

test_alien_echo.cc uses common_init_finish, but it is not available
when WITH_SEASTAR is defined.

Dispatcher::ms_verify_authorizer() is removed by PR#24095.

Signed-off-by: Yingxin <yingxin.cheng@intel.com>
6 years agoMerge pull request #24607 from p-na/fix-osd-list-status-labels
Lenz Grimmer [Wed, 17 Oct 2018 10:59:07 +0000 (12:59 +0200)]
Merge pull request #24607 from p-na/fix-osd-list-status-labels

mgr/dashboard: Fix spaces around status labels on OSD list

Reviewed-by: Laura Paduano <lpaduano@suse.com>
Reviewed-by: Ricardo Marques <rimarques@suse.com>
Reviewed-by: Tiago Melo <tmelo@suse.com>
6 years agoMerge pull request #24614 from s0nea/wip-dashboard-configs-textarea-vertical-resize
Lenz Grimmer [Wed, 17 Oct 2018 10:57:55 +0000 (12:57 +0200)]
Merge pull request #24614 from s0nea/wip-dashboard-configs-textarea-vertical-resize

mgr/dashboard: configs textarea disallow horizontal resize

Reviewed-by: Ricardo Marques <rimarques@suse.com>
Reviewed-by: Tiago Melo <tmelo@suse.com>
6 years agoMerge pull request #24631 from Devp00l/wip-issue-36466
Lenz Grimmer [Wed, 17 Oct 2018 10:56:21 +0000 (12:56 +0200)]
Merge pull request #24631 from Devp00l/wip-issue-36466

mgr/dashboard: Add left padding to helper icon

Reviewed-by: Ricardo Marques <rimarques@suse.com>
Reviewed-by: Tiago Melo <tmelo@suse.com>
6 years agomgr/dashboard: Add left padding to helper icon 24631/head
Stephan Müller [Mon, 8 Oct 2018 15:07:11 +0000 (17:07 +0200)]
mgr/dashboard: Add left padding to helper icon

Fixes: https://tracker.ceph.com/issues/36466
Signed-off-by: Stephan Müller <smueller@suse.com>
6 years agoMerge PR #24578 into master
Sage Weil [Tue, 16 Oct 2018 22:57:24 +0000 (17:57 -0500)]
Merge PR #24578 into master

* refs/pull/24578/head:
pybind/ceph_argparse.py: do not create file for validating CephFilepath

Reviewed-by: Brad Hubbard <bhubbard@redhat.com>
Reviewed-by: Neha Ojha <nojha@redhat.com>
6 years agoMerge PR #24565 into master
Sage Weil [Tue, 16 Oct 2018 22:56:59 +0000 (17:56 -0500)]
Merge PR #24565 into master

* refs/pull/24565/head:
mgr: python 3 compat fixes

Reviewed-by: John Spray <john.spray@redhat.com>
Reviewed-by: Sebastian Wagner <swagner@suse.com>
6 years agoMerge PR #24558 into master
Sage Weil [Tue, 16 Oct 2018 22:56:43 +0000 (17:56 -0500)]
Merge PR #24558 into master

* refs/pull/24558/head:
pybind/mgr: Fix Python 3 imports in diskprediction & insights

Reviewed-by: Noah Watkins <nwatkins@redhat.com>
6 years agoMerge PR #24528 into master
Sage Weil [Tue, 16 Oct 2018 22:55:59 +0000 (17:55 -0500)]
Merge PR #24528 into master

* refs/pull/24528/head:
common/condition_variable_debug: fix wait hooks
common/mutex_debug: remove no-op before/after hooks
common/mutex_debug: do lockdep post-lock step in caller
os/bluestore: {Mutex,Cond} -> ceph::{mutex,condition_variable}
os/bluestore: std::recursive_mutex -> ceph::recursive_mutex
os/bluestore: re-add is_locked assert
os/bluestore: std::{mutex,condition_variable} -> ceph::{mutex,condition_variable}
os/bluestore: use deduction for lock_guard<>, unique_lock<>

Reviewed-by: Casey Bodley <cbodley@redhat.com>
Reviewed-by: Adam Kupczyk <akucpzyk@redhat.com>
6 years agoMerge PR #24151 into master
Sage Weil [Tue, 16 Oct 2018 22:43:02 +0000 (17:43 -0500)]
Merge PR #24151 into master

* refs/pull/24151/head:
mgr/devicehealth: use is_valid_daemon_name helper
mgr/devicehealth: generalize to mon and osd daemons
mon: implement 'smart [devid]' tell command
mgr: parse mon metadata properly
mon: report device id used by mon
common/blkdev: add get_device_by_path
common/blkdev: migrate block_device_run_smartctl from OSD.cc

Reviewed-by: John Spray <john.spray@redhat.com>
6 years agoMerge PR #24566 into master
Sage Weil [Tue, 16 Oct 2018 19:34:19 +0000 (14:34 -0500)]
Merge PR #24566 into master

* refs/pull/24566/head:
osd,mon: keep last_epoch_started along with last_epoch_clean premerge

Reviewed-by: Josh Durgin <jdurgin@redhat.com>
6 years agoMerge PR #24598 into master
Sage Weil [Tue, 16 Oct 2018 18:41:28 +0000 (13:41 -0500)]
Merge PR #24598 into master

* refs/pull/24598/head:
.github/stale.yml: configure probot/stale to automatically close stale issues

Reviewed-by: Erwan Velu <erwan@redhat.com>
Reviewed-by: Piotr Dałek <piotr.dalek@corp.ovh.com>
Reviewed-by: Lenz Grimmer <lgrimmer@suse.com>
Reviewed-by: John Spray <john.spray@redhat.com>
Reviewed-by: Jason Dillaman <dillaman@redhat.com>
Reviewed-by: Neha Ojha <nojha@redhat.com>
Reviewed-by: Gregory Farnum <gfarnum@redhat.com>
6 years agoMerge pull request #24619 from tchaikov/wip-crimson-mon-client-better-encapsulation
Kefu Chai [Tue, 16 Oct 2018 16:45:19 +0000 (00:45 +0800)]
Merge pull request #24619 from tchaikov/wip-crimson-mon-client-better-encapsulation

crimson/mon: move mon::Connection into .cc

Reviewed-by: Casey Bodley <cbodley@redhat.com>
6 years agocrimson/mon: move mon::Connection into .cc 24619/head
Kefu Chai [Tue, 16 Oct 2018 15:24:05 +0000 (23:24 +0800)]
crimson/mon: move mon::Connection into .cc

Signed-off-by: Kefu Chai <kchai@redhat.com>
6 years agoMerge pull request #23849 from tchaikov/wip-crimson-monc
Neha Ojha [Tue, 16 Oct 2018 15:10:36 +0000 (08:10 -0700)]
Merge pull request #23849 from tchaikov/wip-crimson-monc

crimson: add MonClient

Reviewed-by: Casey Bodley <cbodley@redhat.com>
6 years agoMerge PR #24505 into master
Patrick Donnelly [Tue, 16 Oct 2018 14:52:54 +0000 (07:52 -0700)]
Merge PR #24505 into master

* refs/pull/24505/head:
mds: wait shorter intervals if beacon not sent

Reviewed-by: Zheng Yan <zyan@redhat.com>
6 years agoMerge pull request #24373 from mogeb/build-cls-rbd
Kefu Chai [Tue, 16 Oct 2018 14:12:53 +0000 (22:12 +0800)]
Merge pull request #24373 from mogeb/build-cls-rbd

osd: add required cls_* libraries as dependencies of osd

Reviewed-by: Casey Bodley <cbodley@redhat.com>
Reviewed-by: Mykola Golub <mgolub@suse.com>
6 years agomds: wait shorter intervals if beacon not sent 24505/head
Patrick Donnelly [Tue, 9 Oct 2018 22:50:22 +0000 (15:50 -0700)]
mds: wait shorter intervals if beacon not sent

MDS beacon upkeep always waits mds_beacon_interval seconds even when laggy.
Check more frequently for when we stop being laggy to reduce likelihood that
the MDS is removed.
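
The idea can be sketched as a small helper that picks the next sleep duration. This is only an illustration under stated assumptions: `next_upkeep_wait` and `laggy_divisor` are hypothetical names, and the divisor of 4 is an arbitrary example value, not a Ceph option or the ratio used by the actual change.

```cpp
#include <cassert>
#include <chrono>

using seconds_d = std::chrono::duration<double>;

// Hypothetical helper: how long the upkeep loop sleeps before checking again.
// When not laggy it waits the full beacon interval; when laggy it waits only
// a fraction of it so that recovery is noticed sooner.
inline seconds_d next_upkeep_wait(seconds_d beacon_interval, bool laggy,
                                  double laggy_divisor = 4.0) {
  assert(laggy_divisor >= 1.0);  // a divisor below 1 would lengthen the wait
  if (!laggy)
    return beacon_interval;
  return beacon_interval / laggy_divisor;
}
```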

Signed-off-by: Patrick Donnelly <pdonnell@redhat.com>
6 years agoMerge PR #24095 into master
Sage Weil [Tue, 16 Oct 2018 12:20:17 +0000 (07:20 -0500)]
Merge PR #24095 into master

* refs/pull/24095/head:
osd: do not authenticate heartbeat connections until nautilus
msg,osd: enable unauthenticated Dispatcher for pre-nautilus OSD compat
osd: add missing space in heartbeat debug output
mgr/DaemonServer: add missing return
msg/async: drop verify_authorizer wrapper
osd: authenticate ping sessions
msg/simple: remove verify_authorizer wrapper
msg: remove unused ms_verify_authorizer
msg/async: remove get_authorizer wrapper
msg/simple: remove get_authorizer wrapper
msg/Messenger: pull authenticator validation into Messenger
msg/Messenger: uninline ms_deliver_verify_authorizer
mgr/DaemonServer: expose keyring for authenticator verification
mon: expose keyring for msgr1 authentication
mds: expose keyring for authenticater verification
osd: expose keyring for authenticater verification
msg/Dispatcher: add ms_get_auth1_authorizer_keystore
mon: fix ref cycle breakage in handle_forward
mon: use MonOpRequest get_session() instead of PaxosServiceMessage's
mon: get session from MonOpRequest in handle_command
messages/MForward: drop unused ctor
mon: use ms_handle_authentication to parse caps
mon: kill Session::global_id and use Connection member instead
mgr/DaemonServer: move session setup into ms_handle_authentication
mds: move session setup into ms_handle_authentication
osd: move session setup into ms_handle_authentication
msg: new ms_handle_authentication, add fields to Connection

Reviewed-by: Ricardo Dias <rdias@suse.com>
6 years agoMerge PR #24579 into master
Sage Weil [Tue, 16 Oct 2018 12:17:59 +0000 (07:17 -0500)]
Merge PR #24579 into master

* refs/pull/24579/head:
qa/osd: fixup osd-rep-recov-eio.sh fails to parse pg dump

Reviewed-by: John Spray <john.spray@redhat.com>
Reviewed-by: Kefu Chai <kchai@redhat.com>
Reviewed-by: Sage Weil <sage@redhat.com>
6 years agoMerge pull request #24597 from batrick/i36450
John Spray [Tue, 16 Oct 2018 12:09:08 +0000 (13:09 +0100)]
Merge pull request #24597 from batrick/i36450

qa: fix run call args

Reviewed-by: Kefu Chai <kchai@redhat.com>
Reviewed-by: John Spray <john.spray@redhat.com>
Reviewed-by: Lenz Grimmer <lgrimmer@suse.com>
6 years agoMerge pull request #23983 from theanalyst/vstart-rgw-asok
Kefu Chai [Tue, 16 Oct 2018 11:15:35 +0000 (19:15 +0800)]
Merge pull request #23983 from theanalyst/vstart-rgw-asok

vstart: set admin socket for RGW in conf

Reviewed-by: Casey Bodley <cbodley@redhat.com>
6 years agomgr/dashboard: configs textarea disallow horizontal resize 24614/head
Tatjana Dehler [Tue, 16 Oct 2018 10:28:16 +0000 (12:28 +0200)]
mgr/dashboard: configs textarea disallow horizontal resize

The textarea allows horizontal and vertical resize by default. Only the
vertical resize is appropriate for this form.

Fixes: http://tracker.ceph.com/issues/36452
Signed-off-by: Tatjana Dehler <tdehler@suse.com>
6 years agomgr/dashboard: Fix spaces around status labels on OSD list 24607/head
Patrick Nawracay [Tue, 16 Oct 2018 10:03:48 +0000 (12:03 +0200)]
mgr/dashboard: Fix spaces around status labels on OSD list

Signed-off-by: Patrick Nawracay <pnawracay@suse.com>
6 years agoMerge pull request #24576 from cyx1231st/wip-seastar-msgr-refactor
Kefu Chai [Tue, 16 Oct 2018 08:44:16 +0000 (16:44 +0800)]
Merge pull request #24576 from cyx1231st/wip-seastar-msgr-refactor

crimson/net: seastar-msgr refactoring

Reviewed-by: Casey Bodley <cbodley@redhat.com>
Reviewed-by: Kefu Chai <kchai@redhat.com>
6 years agoMerge pull request #24255 from gregsfortytwo/wip-crush-doc
Kefu Chai [Tue, 16 Oct 2018 05:49:19 +0000 (13:49 +0800)]
Merge pull request #24255 from gregsfortytwo/wip-crush-doc

doc: explain 'firstn v indep' in the CRUSH docs

Reviewed-by: Brad Hubbard <bhubbard@redhat.com>
6 years agoMerge PR #24286 into master
Patrick Donnelly [Tue, 16 Oct 2018 04:34:31 +0000 (21:34 -0700)]
Merge PR #24286 into master

* refs/pull/24286/head:
client: fix fuse client can't read or write data due its caps is invalid

Reviewed-by: Patrick Donnelly <pdonnell@redhat.com>
6 years agoMerge PR #24292 into master
Patrick Donnelly [Tue, 16 Oct 2018 04:31:04 +0000 (21:31 -0700)]
Merge PR #24292 into master

* refs/pull/24292/head:
qa: add test for rctime on root inode
mds: set rctime on new system inode
mds: small refactor

Reviewed-by: Zheng Yan <zyan@redhat.com>
6 years agoMerge PR #24486 into master
Patrick Donnelly [Tue, 16 Oct 2018 04:23:19 +0000 (21:23 -0700)]
Merge PR #24486 into master

* refs/pull/24486/head:
client: explicitly show blacklisted state via asok status command

Reviewed-by: Patrick Donnelly <pdonnell@redhat.com>
6 years agoMerge PR #24508 into master
Patrick Donnelly [Tue, 16 Oct 2018 04:12:56 +0000 (21:12 -0700)]
Merge PR #24508 into master

* refs/pull/24508/head:
cephfs-shell: fixup 'str' object has no attribute 'decode'

Reviewed-by: Patrick Donnelly <pdonnell@redhat.com>
6 years agoMerge PR #24596 into master
Sage Weil [Mon, 15 Oct 2018 23:41:53 +0000 (18:41 -0500)]
Merge PR #24596 into master

* refs/pull/24596/head:
ptl-tool.py: move githubmap update into merge commit

Reviewed-by: Sage Weil <sage@redhat.com>
Reviewed-by: Neha Ojha <nojha@redhat.com>
6 years ago.github/stale.yml: configure probot/stale to automatically close stale issues 24598/head
Sage Weil [Mon, 15 Oct 2018 22:36:43 +0000 (17:36 -0500)]
.github/stale.yml: configure probot/stale to automatically close stale issues

Initially only warn about closing issues, but do not close them.

Signed-off-by: Sage Weil <sage@redhat.com>
6 years agoqa: fix run call args 24597/head
Patrick Donnelly [Mon, 15 Oct 2018 21:44:45 +0000 (14:44 -0700)]
qa: fix run call args

Fixes: http://tracker.ceph.com/issues/36450
Introduced-by: 95746ecce9215c8428a02f1745d03e10536a4129
Signed-off-by: Patrick Donnelly <pdonnell@redhat.com>
6 years agoMerge PR #24562 into master
Patrick Donnelly [Mon, 15 Oct 2018 21:34:21 +0000 (14:34 -0700)]
Merge PR #24562 into master

* refs/pull/24562/head:
removed warning for resolved errata

Reviewed-by: Patrick Donnelly <pdonnell@redhat.com>
6 years agoMerge pull request #24186 from libingyang-zte/master
Gregory Farnum [Mon, 15 Oct 2018 21:17:36 +0000 (14:17 -0700)]
Merge pull request #24186 from libingyang-zte/master

doc: fix typos in doc/releases

6 years agoMerge pull request #24238 from LenzGr/doc-leads-update
Gregory Farnum [Mon, 15 Oct 2018 21:15:43 +0000 (14:15 -0700)]
Merge pull request #24238 from LenzGr/doc-leads-update

doc/dev: Updated component leads table

6 years agocrimson/net: clean seastar-msgr class dependencies 24576/head
Yingxin [Mon, 15 Oct 2018 09:46:55 +0000 (17:46 +0800)]
crimson/net: clean seastar-msgr class dependencies

Remove protocol-specific interfaces from Messenger/Connection classes,
and let SocketMessenger manage SocketConnection instead of Connection.

Signed-off-by: Yingxin <yingxin.cheng@intel.com>
6 years agoosd: do not authenticate heartbeat connections until nautilus 24095/head
Sage Weil [Mon, 15 Oct 2018 12:35:24 +0000 (07:35 -0500)]
osd: do not authenticate heartbeat connections until nautilus

Some (currently) pre-nautilus OSDs will crash if you try to authenticate
a heartbeat connection but they are not expecting it:

src/auth/Crypto.h: 109: FAILED assert(ckh)
 ceph version 12.2.8-457-gccd69ef (ccd69ef36aafebab964a2e47e249fdb95e083e46) luminous (stable)
 1: (ceph::__ceph_assert_fail(char const*, char const*, int, char const*)+0x110) [0x5618b04aaea0]
 2: (()+0x41cbec) [0x5618afe2cbec]
 3: (CephxSessionHandler::_calc_signature(Message*, unsigned long*)+0x8c5) [0x5618b0777ba5]
 4: (CephxSessionHandler::check_message_signature(Message*)+0x7d) [0x5618b077800d]
 5: (AsyncConnection::process()+0x1b44) [0x5618b0761a04]
 6: (EventCenter::process_events(int, std::chrono::duration<unsigned long, std::ratio<1l, 1000000000l> >*)+0x359) [0x5618b0546079]
 7: (()+0xb38c3e) [0x5618b0548c3e]
 8: (()+0xb5070) [0x7ff04faf5070]
 9: (()+0x7dd5) [0x7ff050168dd5]
 10: (clone()+0x6d) [0x7ff04f259b3d]

See http://tracker.ceph.com/issues/36443

It won't be fixed in all clusters before upgrade to nautilus, though, so
we also need to work around it here.

Signed-off-by: Sage Weil <sage@redhat.com>
6 years agomsg,osd: enable unauthenticated Dispatcher for pre-nautilus OSD compat
Sage Weil [Wed, 19 Sep 2018 16:44:32 +0000 (11:44 -0500)]
msg,osd: enable unauthenticated Dispatcher for pre-nautilus OSD compat

Before nautilus, osd heartbeats are sent over an unauthenticated channel.
We need support here to allow these connections when they are necessary
for upgrade compatibility.

Signed-off-by: Sage Weil <sage@redhat.com>
6 years agoosd: add missing space in heartbeat debug output
Sage Weil [Thu, 11 Oct 2018 19:28:58 +0000 (14:28 -0500)]
osd: add missing space in heartbeat debug output

Signed-off-by: Sage Weil <sage@redhat.com>
6 years agomgr/DaemonServer: add missing return
Sage Weil [Fri, 21 Sep 2018 14:10:22 +0000 (09:10 -0500)]
mgr/DaemonServer: add missing return

Fixes: http://tracker.ceph.com/issues/36110
Signed-off-by: Sage Weil <sage@redhat.com>
6 years agomsg/async: drop verify_authorizer wrapper
Sage Weil [Thu, 20 Sep 2018 21:19:15 +0000 (16:19 -0500)]
msg/async: drop verify_authorizer wrapper

Signed-off-by: Sage Weil <sage@redhat.com>
6 years agoosd: authenticate ping sessions
Sage Weil [Wed, 19 Sep 2018 16:35:42 +0000 (11:35 -0500)]
osd: authenticate ping sessions

Do not set up a Session object, though--nobody cares (currently!).

This avoids special-casing the generic authorizer validation code in
msg/* to handle non-authenticated sessions.  Also, it seems like a good
idea to authenticate these sessions!

Signed-off-by: Sage Weil <sage@redhat.com>
6 years agomsg/simple: remove verify_authorizer wrapper
Sage Weil [Wed, 19 Sep 2018 15:52:45 +0000 (10:52 -0500)]
msg/simple: remove verify_authorizer wrapper

Signed-off-by: Sage Weil <sage@redhat.com>
6 years agomsg: remove unused ms_verify_authorizer
Sage Weil [Thu, 13 Sep 2018 19:38:21 +0000 (14:38 -0500)]
msg: remove unused ms_verify_authorizer

Signed-off-by: Sage Weil <sage@redhat.com>
6 years agomsg/async: remove get_authorizer wrapper
Sage Weil [Thu, 13 Sep 2018 19:32:42 +0000 (14:32 -0500)]
msg/async: remove get_authorizer wrapper

Signed-off-by: Sage Weil <sage@redhat.com>
6 years agomsg/simple: remove get_authorizer wrapper
Sage Weil [Thu, 13 Sep 2018 19:29:59 +0000 (14:29 -0500)]
msg/simple: remove get_authorizer wrapper

Signed-off-by: Sage Weil <sage@redhat.com>
6 years agomsg/Messenger: pull authenticator validation into Messenger
Sage Weil [Thu, 13 Sep 2018 19:21:04 +0000 (14:21 -0500)]
msg/Messenger: pull authenticator validation into Messenger

This code is essentially identical across the OSD and MDS.  The
monitor is annoyingly different, but in a msgr1 specific way that
we can handle carrying here until msgr1 gets ripped out in a
couple years.

Signed-off-by: Sage Weil <sage@redhat.com>
6 years agoqa/osd: fixup osd-rep-recov-eio.sh fails to parse pg dump 24579/head
huanwen ren [Mon, 15 Oct 2018 17:47:07 +0000 (01:47 +0800)]
qa/osd: fixup osd-rep-recov-eio.sh fails to parse pg dump

Fixes: http://tracker.ceph.com/issues/36418
Signed-off-by: huanwen ren <ren.huanwen@zte.com.cn>
6 years agoMerge pull request #24577 from p-na/fix-saves-invalid-config
Lenz Grimmer [Mon, 15 Oct 2018 16:01:41 +0000 (18:01 +0200)]
Merge pull request #24577 from p-na/fix-saves-invalid-config

dashboard/mgr: Save button doesn't prevent saving an invalid form

Reviewed-by: Ricardo Marques <rimarques@suse.com>
Reviewed-by: Tatjana Dehler <tdehler@suse.com>
6 years agocommon/condition_variable_debug: fix wait hooks 24528/head
Sage Weil [Mon, 15 Oct 2018 15:26:31 +0000 (10:26 -0500)]
common/condition_variable_debug: fix wait hooks

The post_lock hook needs to be called even when the cond times out, because
the lock is still locked at that point.
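
A minimal sketch of why the hook must run on both return paths, using standard-library primitives. `DebugMutex`, `DebugCond`, `pre_unlock` and `post_lock` are hypothetical stand-ins for the debug wrappers, not the Ceph classes. The bookkeeping flag has to be restored after the wait returns, whether it was signalled or timed out, because the underlying mutex is reacquired in both cases.

```cpp
#include <cassert>
#include <chrono>
#include <condition_variable>
#include <mutex>

// Hypothetical debug mutex that tracks whether it is currently held.
class DebugMutex {
  std::mutex m;
  bool locked = false;
public:
  void lock() { m.lock(); post_lock(); }
  void unlock() { pre_unlock(); m.unlock(); }
  void pre_unlock() { assert(locked); locked = false; }
  void post_lock() { assert(!locked); locked = true; }
  bool is_locked() const { return locked; }
  std::mutex& native() { return m; }
};

// Hypothetical debug condition variable layered on DebugMutex.
class DebugCond {
  std::condition_variable cv;
public:
  // The caller must already hold `dm`.
  std::cv_status wait_for(DebugMutex& dm, std::chrono::milliseconds d) {
    dm.pre_unlock();  // the wait releases the mutex internally
    std::unique_lock<std::mutex> l(dm.native(), std::adopt_lock);
    std::cv_status r = cv.wait_for(l, d);
    l.release();      // the mutex stays held; ownership returns to the caller
    dm.post_lock();   // runs on timeout as well: the mutex is held again
    return r;
  }
  void notify_one() { cv.notify_one(); }
};
```

If `post_lock` were skipped on the timeout path, `is_locked()` would report false while the mutex is in fact held, and the next `unlock()` would trip its assertion.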

Signed-off-by: Sage Weil <sage@redhat.com>
6 years agoremoved warning for resolved errata 24562/head
Scoots Hamilton [Mon, 15 Oct 2018 15:18:43 +0000 (11:18 -0400)]
removed warning for resolved errata

Signed-off-by: Scoots Hamilton <scoots@redhat.com>
6 years agomgr/devicehealth: use is_valid_daemon_name helper 24151/head
Sage Weil [Fri, 12 Oct 2018 21:34:59 +0000 (16:34 -0500)]
mgr/devicehealth: use is_valid_daemon_name helper

Signed-off-by: Sage Weil <sage@redhat.com>
6 years agomgr/devicehealth: generalize to mon and osd daemons
Sage Weil [Tue, 18 Sep 2018 21:48:48 +0000 (16:48 -0500)]
mgr/devicehealth: generalize to mon and osd daemons

Signed-off-by: Sage Weil <sage@redhat.com>
6 years agomon: implement 'smart [devid]' tell command
Sage Weil [Tue, 18 Sep 2018 21:47:56 +0000 (16:47 -0500)]
mon: implement 'smart [devid]' tell command

Just like the OSD!

Signed-off-by: Sage Weil <sage@redhat.com>
6 years agomgr: parse mon metadata properly
Sage Weil [Tue, 18 Sep 2018 19:09:28 +0000 (14:09 -0500)]
mgr: parse mon metadata properly

Identify the "device_ids" like we do with OSD metadata by using the
helper.

Signed-off-by: Sage Weil <sage@redhat.com>
6 years agomon: report device id used by mon
Sage Weil [Tue, 18 Sep 2018 19:08:41 +0000 (14:08 -0500)]
mon: report device id used by mon

This will feed into the same device tracking that OSDs currently use.

Signed-off-by: Sage Weil <sage@redhat.com>
6 years agocommon/blkdev: add get_device_by_path
Sage Weil [Tue, 18 Sep 2018 19:29:21 +0000 (14:29 -0500)]
common/blkdev: add get_device_by_path

Signed-off-by: Sage Weil <sage@redhat.com>
6 years agocommon/blkdev: migrate block_device_run_smartctl from OSD.cc
Sage Weil [Tue, 18 Sep 2018 19:20:10 +0000 (14:20 -0500)]
common/blkdev: migrate block_device_run_smartctl from OSD.cc

Slight change in behavior here in that we return the error message in
the result.

Signed-off-by: Sage Weil <sage@redhat.com>
6 years agocommon/mutex_debug: remove no-op before/after hooks
Sage Weil [Mon, 15 Oct 2018 13:56:17 +0000 (08:56 -0500)]
common/mutex_debug: remove no-op before/after hooks

These used to be for the timing instrumentation.

Signed-off-by: Sage Weil <sage@redhat.com>
6 years agocommon/mutex_debug: do lockdep post-lock step in caller
Sage Weil [Mon, 15 Oct 2018 13:55:39 +0000 (08:55 -0500)]
common/mutex_debug: do lockdep post-lock step in caller

Signed-off-by: Sage Weil <sage@redhat.com>
6 years agoMerge PR #24473 into master
Sage Weil [Mon, 15 Oct 2018 13:42:50 +0000 (08:42 -0500)]
Merge PR #24473 into master

* refs/pull/24473/head:
common: drop get_contiguous() from ceph::bufferlist.

Reviewed-by: Sage Weil <sage@redhat.com>
6 years agoMerge PR #24493 into master
Sage Weil [Mon, 15 Oct 2018 13:36:18 +0000 (08:36 -0500)]
Merge PR #24493 into master

* refs/pull/24493/head:
mgr/DaemonState: clean up device life_expectancy values
mgr/devicehealth: warn based on life_expectancy_max
mgr/devicehealth: warn on failing devices at 6 weeks

Reviewed-by: John Spray <john.spray@redhat.com>
6 years agoMerge pull request #24523 from s0nea/wip-dashboard-configs-table-cleanup
Lenz Grimmer [Mon, 15 Oct 2018 13:14:52 +0000 (15:14 +0200)]
Merge pull request #24523 from s0nea/wip-dashboard-configs-table-cleanup

mgr/dashboard: config options table cleanup

Reviewed-by: Ricardo Marques <rimarques@suse.com>
Reviewed-by: Tiago Melo <tmelo@suse.com>
6 years agomgr/dashboard: config options table cleanup 24523/head
Tatjana Dehler [Wed, 10 Oct 2018 14:12:32 +0000 (16:12 +0200)]
mgr/dashboard: config options table cleanup

Remove columns 'tags', 'enum_values', 'long_desc', 'type', 'flags',
'daemon_default', 'desc', 'level', 'can_update_at_runtime', 'services',
'max', 'see_also', 'min' and 'source' from table view and add them to
the details.
The table contains 'name', 'value' and 'default' only.

Fixes: http://tracker.ceph.com/issues/34533
Signed-off-by: Tatjana Dehler <tdehler@suse.com>
6 years agopybind/ceph_argparse.py: do not create file for validating CephFilepath 24578/head
Kefu Chai [Mon, 15 Oct 2018 08:55:17 +0000 (16:55 +0800)]
pybind/ceph_argparse.py: do not create file for validating CephFilepath

Before this change, a file was opened in append mode to validate the
specified CephFilepath argument, but this file was never removed after
the validation. We cannot assume that this file will be overwritten or
removed by the Ceph daemon that serves the command.

After this change, no file is created for the validation; instead we
check whether the file is readable or writable.
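
The difference between the two approaches can be shown with a side-effect-free permission check. The change itself is in Python (ceph_argparse.py); the following is a C++ analogue using the POSIX `access()` call, and `path_is_accessible` is a hypothetical name. It is a sketch of the technique, not the validator's actual logic.

```cpp
#include <cassert>
#include <string>
#include <unistd.h>

// Hypothetical validator: true if `path` is readable or writable by the
// calling process. Unlike opening the file in append mode, access() never
// creates the file, so a failed or successful check leaves nothing behind.
inline bool path_is_accessible(const std::string& path) {
  assert(!path.empty());
  return ::access(path.c_str(), R_OK) == 0 ||
         ::access(path.c_str(), W_OK) == 0;
}
```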

Signed-off-by: Kefu Chai <kchai@redhat.com>
6 years agoMerge pull request #24560 from sebastian-philipp/orchestrator-fix-rook-cluster-in...
John Spray [Mon, 15 Oct 2018 08:23:39 +0000 (09:23 +0100)]
Merge pull request #24560 from sebastian-philipp/orchestrator-fix-rook-cluster-in-name

mgr/rook: Fix Rook cluster name detection

Reviewed-by: John Spray <john.spray@redhat.com>
6 years agomgr/dashboard: Save button doesn't prevent saving an invalid form 24577/head
Patrick Nawracay [Sat, 13 Oct 2018 20:58:36 +0000 (22:58 +0200)]
mgr/dashboard: Save button doesn't prevent saving an invalid form

Fixes: http://tracker.ceph.com/issues/36426
Signed-off-by: Patrick Nawracay <pnawracay@suse.com>
6 years agoMerge PR #24494 into master
Sage Weil [Sun, 14 Oct 2018 18:11:11 +0000 (13:11 -0500)]
Merge PR #24494 into master

* refs/pull/24494/head:
ceph-kvstore-tool: rename repair -> destructive-repair

Reviewed-by: Neha Ojha <nojha@redhat.com>
6 years agoos/bluestore: {Mutex,Cond} -> ceph::{mutex,condition_variable}
Sage Weil [Wed, 10 Oct 2018 17:58:33 +0000 (12:58 -0500)]
os/bluestore: {Mutex,Cond} -> ceph::{mutex,condition_variable}

Signed-off-by: Sage Weil <sage@redhat.com>
6 years agoos/bluestore: std::recursive_mutex -> ceph::recursive_mutex
Sage Weil [Wed, 10 Oct 2018 17:38:06 +0000 (12:38 -0500)]
os/bluestore: std::recursive_mutex -> ceph::recursive_mutex

Signed-off-by: Sage Weil <sage@redhat.com>
6 years agoos/bluestore: re-add is_locked assert
Sage Weil [Wed, 10 Oct 2018 17:57:41 +0000 (12:57 -0500)]
os/bluestore: re-add is_locked assert

We have this now with ceph::mutex

Signed-off-by: Sage Weil <sage@redhat.com>
6 years agoos/bluestore: std::{mutex,condition_variable} -> ceph::{mutex,condition_variable}
Sage Weil [Wed, 10 Oct 2018 17:37:46 +0000 (12:37 -0500)]
os/bluestore: std::{mutex,condition_variable} -> ceph::{mutex,condition_variable}

Signed-off-by: Sage Weil <sage@redhat.com>
6 years agomsg/Messenger: uninline ms_deliver_verify_authorizer
Sage Weil [Thu, 13 Sep 2018 19:14:43 +0000 (14:14 -0500)]
msg/Messenger: uninline ms_deliver_verify_authorizer

Signed-off-by: Sage Weil <sage@redhat.com>
6 years agomgr/DaemonServer: expose keyring for authenticator verification
Sage Weil [Thu, 13 Sep 2018 19:27:49 +0000 (14:27 -0500)]
mgr/DaemonServer: expose keyring for authenticator verification

Signed-off-by: Sage Weil <sage@redhat.com>
6 years agomon: expose keyring for msgr1 authentication
Sage Weil [Thu, 13 Sep 2018 19:05:08 +0000 (14:05 -0500)]
mon: expose keyring for msgr1 authentication

Signed-off-by: Sage Weil <sage@redhat.com>
6 years agomds: expose keyring for authenticater verification
Sage Weil [Thu, 13 Sep 2018 19:03:21 +0000 (14:03 -0500)]
mds: expose keyring for authenticater verification

Signed-off-by: Sage Weil <sage@redhat.com>
6 years agoosd: expose keyring for authenticater verification
Sage Weil [Thu, 13 Sep 2018 19:02:57 +0000 (14:02 -0500)]
osd: expose keyring for authenticater verification

Signed-off-by: Sage Weil <sage@redhat.com>
6 years agomsg/Dispatcher: add ms_get_auth1_authorizer_keystore
Sage Weil [Thu, 13 Sep 2018 19:02:34 +0000 (14:02 -0500)]
msg/Dispatcher: add ms_get_auth1_authorizer_keystore

This is there to provide the keyring used for authenticating msgr1
authorizers.

Signed-off-by: Sage Weil <sage@redhat.com>
6 years agomon: fix ref cycle breakage in handle_forward
Sage Weil [Thu, 13 Sep 2018 19:00:44 +0000 (14:00 -0500)]
mon: fix ref cycle breakage in handle_forward

We now rely on the session -> connection ref for printing
remote addr, peer_global_id, and so on.  Change this code to
break the ref cycle instead by removing the con->session link,
which is only needed by the MonOpRequest ctor called at the top
of _ms_dispatch.
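
The shape of the cycle and the chosen break point can be sketched as follows. This uses `std::shared_ptr` as a stand-in for Ceph's own reference counting, and `Connection`, `Session`, `make_linked_session` and `break_cycle` are simplified hypothetical types and helpers, not the monitor's classes. The session keeps its link to the connection so address and id lookups still work; only the connection's link back to the session is dropped.

```cpp
#include <cassert>
#include <memory>

struct Session;

struct Connection {
  std::shared_ptr<Session> session;  // con -> session link
};

struct Session {
  std::shared_ptr<Connection> con;   // session -> con link, kept for lookups
};

// Hypothetical helper: build a pair linked in both directions (a ref cycle).
inline std::shared_ptr<Session> make_linked_session() {
  auto con = std::make_shared<Connection>();
  auto s = std::make_shared<Session>();
  s->con = con;
  con->session = s;
  return s;
}

// Break the cycle by dropping con -> session; session -> con stays usable.
inline void break_cycle(const std::shared_ptr<Session>& s) {
  assert(s && s->con);
  s->con->session.reset();
}
```

After `break_cycle`, releasing the last external reference to the session frees both objects, while code holding the session can still reach the connection through `s->con`.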

Signed-off-by: Sage Weil <sage@redhat.com>
6 years agomon: use MonOpRequest get_session() instead of PaxosServiceMessage's
Sage Weil [Tue, 9 Oct 2018 22:08:46 +0000 (17:08 -0500)]
mon: use MonOpRequest get_session() instead of PaxosServiceMessage's

The PaxosServiceMessage method relies on the msg -> con -> session linkage,
and the con -> session link is not present for forwarded messages.  Also,
the message path is redundant and unnecessary.

Signed-off-by: Sage Weil <sage@redhat.com>
6 years agomon: get session from MonOpRequest in handle_command
Sage Weil [Thu, 13 Sep 2018 18:59:25 +0000 (13:59 -0500)]
mon: get session from MonOpRequest in handle_command

We should have made this switchover a long time ago.

Signed-off-by: Sage Weil <sage@redhat.com>
6 years agomessages/MForward: drop unused ctor
Sage Weil [Tue, 9 Oct 2018 22:08:56 +0000 (17:08 -0500)]
messages/MForward: drop unused ctor

Signed-off-by: Sage Weil <sage@redhat.com>