git.apps.os.sepia.ceph.com Git - ceph.git/log
ceph.git
7 years ago12.1.4 v12.1.4
Jenkins Build Slave User [Tue, 15 Aug 2017 13:45:11 +0000 (13:45 +0000)]
12.1.4

7 years agoMerge pull request #17001 from gregsfortytwo/wip-20985-divergent-handling-luminous
Gregory Farnum [Mon, 14 Aug 2017 13:54:36 +0000 (06:54 -0700)]
Merge pull request #17001 from gregsfortytwo/wip-20985-divergent-handling-luminous

Wip 20985 divergent handling luminous

7 years agomgr: implement 'osd safe-to-destroy' and 'ok-to-stop' commands
Sage Weil [Thu, 10 Aug 2017 18:06:02 +0000 (14:06 -0400)]
mgr: implement 'osd safe-to-destroy' and 'ok-to-stop' commands

An osd is safe to destroy if

- we have osd_stat for it
- osd_stat indicates no pgs stored
- all pgs are known
- no pgs map to it

An osd is ok to stop if

- we have pg stats
- no pgs will drop below min_size

Signed-off-by: Sage Weil <sage@redhat.com>
(cherry picked from commit bf9380457bba5a834ffa2927c73165e0f1960332)
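The two checks listed in this commit message can be sketched as plain predicates. This is only an illustration under stated assumptions, not the mgr implementation: `PGInfo`, `ClusterView`, and their fields are hypothetical names, "no pgs map to it" is modelled with the acting set only, and "we have pg stats" is modelled as a non-empty PG table.

```python
from dataclasses import dataclass, field
from typing import Dict, List, Set


@dataclass
class PGInfo:
    """Hypothetical view of one placement group."""
    acting: List[int]   # OSD ids currently serving this PG
    min_size: int       # fewest replicas the pool needs to keep serving I/O


@dataclass
class ClusterView:
    """Hypothetical snapshot of the state the two checks consult."""
    # osd id -> number of PGs stored locally, taken from that OSD's osd_stat
    osd_num_pgs: Dict[int, int] = field(default_factory=dict)
    pgs: Dict[str, PGInfo] = field(default_factory=dict)
    num_unknown_pgs: int = 0   # PGs whose state has not been reported yet


def safe_to_destroy(view: ClusterView, osd: int) -> bool:
    if osd not in view.osd_num_pgs:       # we have osd_stat for it
        return False
    if view.osd_num_pgs[osd] != 0:        # osd_stat indicates no pgs stored
        return False
    if view.num_unknown_pgs != 0:         # all pgs are known
        return False
    # no pgs map to it
    return all(osd not in pg.acting for pg in view.pgs.values())


def ok_to_stop(view: ClusterView, osds: Set[int]) -> bool:
    if not view.pgs:                      # we have pg stats
        return False
    for pg in view.pgs.values():
        remaining = [o for o in pg.acting if o not in osds]
        if len(remaining) < pg.min_size:  # this pg would drop below min_size
            return False
    return True
```

Keeping the two predicates separate mirrors the message: destroying an OSD requires that it hold no data at all, while stopping it only requires that every PG stay at or above min_size.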

7 years agoosd/osd_types: include number of locally stored PGs in osd_stat_t
Sage Weil [Thu, 10 Aug 2017 18:05:10 +0000 (14:05 -0400)]
osd/osd_types: include number of locally stored PGs in osd_stat_t

Signed-off-by: Sage Weil <sage@redhat.com>
(cherry picked from commit c2945479f4364ba599262e00cbf200dff66617bb)

7 years agoosd/OSDMap: add parse_osd_id_list helper
Sage Weil [Thu, 10 Aug 2017 22:13:40 +0000 (18:13 -0400)]
osd/OSDMap: add parse_osd_id_list helper

Signed-off-by: Sage Weil <sage@redhat.com>
(cherry picked from commit 6fc33a046ba880e42302c3ff54b69747df47b9a2)

7 years agocrush/CrushWrapper: fixing timing of removal in remove_item_under
Sage Weil [Sat, 12 Aug 2017 19:19:45 +0000 (15:19 -0400)]
crush/CrushWrapper: fixing timing of removal in remove_item_under

Do it after we reweight, not before

Signed-off-by: Sage Weil <sage@redhat.com>
(cherry picked from commit 46ded6eb5b099db783e6ff4a1baa44eb1300e9c3)

7 years agocrush/CrushWrapper: fix iterator invalidation in cleanup_dead_classes
Sage Weil [Sat, 12 Aug 2017 18:45:42 +0000 (14:45 -0400)]
crush/CrushWrapper: fix iterator invalidation in cleanup_dead_classes

Signed-off-by: Sage Weil <sage@redhat.com>
(cherry picked from commit 061c21786adc92a304467d1160f26ca09718b03f)

7 years agocrush/CrushWrapper: keep weights and/or ids null if empty
Sage Weil [Wed, 9 Aug 2017 21:27:49 +0000 (17:27 -0400)]
crush/CrushWrapper: keep weights and/or ids null if empty

Signed-off-by: Sage Weil <sage@redhat.com>
(cherry picked from commit b3838c83d501bf2f40bb799300cde049123b0eb0)

7 years agocrush/builder: fix ENOENT when removing last bucket item
Sage Weil [Wed, 9 Aug 2017 21:25:12 +0000 (17:25 -0400)]
crush/builder: fix ENOENT when removing last bucket item

We were decrementing size and then breaking our ENOENT condition check.
Fix by decrementing size only after we break out of the loop and verify
we found the item.

Fix a follow-on bug by avoiding realloc when we have 0 items left.  This case
was never exercised before due to the ENOENT issue; now we return explicitly.
It's really not necessary to realloc at all, probably, since these are very
small arrays, but in any case leaving a single item allocation there in place of
a 0-length allocation is fine.  (And the 0-length allocation behavior on realloc
is undefined; it may return either a pointer or NULL.)

Signed-off-by: Sage Weil <sage@redhat.com>
(cherry picked from commit 5a982050c1daddee33bb2b5d63a5936723283765)
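The ordering described in this commit message can be sketched as follows. This is a Python model of the idea, not the C code in crush/builder: `Bucket` and `bucket_remove_item` are hypothetical names, and truncating the list stands in for the realloc call.

```python
import errno


class Bucket:
    """Hypothetical bucket: a backing array plus a count of slots in use."""

    def __init__(self, items):
        self.items = list(items)
        self.size = len(self.items)


def bucket_remove_item(bucket, item):
    # Look for the item first; size is left untouched while searching.
    i = 0
    while i < bucket.size:
        if bucket.items[i] == item:
            break
        i += 1
    if i == bucket.size:
        return -errno.ENOENT          # not found, nothing was modified

    # Shift the tail down over the removed slot.
    for j in range(i, bucket.size - 1):
        bucket.items[j] = bucket.items[j + 1]

    bucket.size -= 1                  # decrement only after the item was found
    if bucket.size == 0:
        return 0                      # keep the one-slot allocation, skip the shrink

    del bucket.items[bucket.size:]    # stands in for realloc to the new size
    return 0
```

Removing the last remaining item returns 0 and leaves the one-slot backing array in place, which is the case the message says was never exercised before.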

7 years agoqa/workunits/mon/crush_ops.sh: test weight sets vs device classes
Sage Weil [Mon, 7 Aug 2017 22:30:39 +0000 (18:30 -0400)]
qa/workunits/mon/crush_ops.sh: test weight sets vs device classes

Signed-off-by: Sage Weil <sage@redhat.com>
(cherry picked from commit 9c42597f096a3a08a552a7a14c6e5da415cf0037)

7 years agomon/OSDMonitor: remove choose_args when pool is removed
Sage Weil [Mon, 7 Aug 2017 22:26:09 +0000 (18:26 -0400)]
mon/OSDMonitor: remove choose_args when pool is removed

Signed-off-by: Sage Weil <sage@redhat.com>
(cherry picked from commit 49c468b78849b09ad3db2f6028ebf588c6fd2ac3)

7 years agocrush/CrushWrapper: fill in weight-sets when we build shadow trees
Sage Weil [Mon, 7 Aug 2017 21:56:06 +0000 (17:56 -0400)]
crush/CrushWrapper: fill in weight-sets when we build shadow trees

When we build the shadow buckets for the class hierarchies, we need
to fill in the weight-sets for each shadow bucket too.

Skip the ids vector for now since it's not yet used by anything.

Fixes: http://tracker.ceph.com/issues/20939
Signed-off-by: Sage Weil <sage@redhat.com>
(cherry picked from commit 0ed55e6150e3d82b2955cf3e9fa0f01b36a6474a)

7 years agocrush/CrushWrapper: remove unused 'unused' arg for trim_roots_with_classes
Sage Weil [Mon, 7 Aug 2017 20:54:11 +0000 (16:54 -0400)]
crush/CrushWrapper: remove unused 'unused' arg for trim_roots_with_classes

Signed-off-by: Sage Weil <sage@redhat.com>
(cherry picked from commit 2649b27d13d364c6533d514b0d0df53fd6519551)

7 years agocrush: do add/remove before updating weight-sets
Sage Weil [Mon, 7 Aug 2017 20:50:44 +0000 (16:50 -0400)]
crush: do add/remove before updating weight-sets

Signed-off-by: Sage Weil <sage@redhat.com>
(cherry picked from commit 1f62ee8408d8cff91df5fd23f2ddfd58197a5b10)

7 years agoosd: Fix Paxos shutdown handling for commit_finish race
David Zafman [Mon, 7 Aug 2017 19:48:27 +0000 (12:48 -0700)]
osd: Fix Paxos shutdown handling for commit_finish race

Fixes: http://tracker.ceph.com/issues/20921
Signed-off-by: David Zafman <dzafman@redhat.com>
(cherry picked from commit c1404574895bc5a091f36b8119c7500fe1e287f1)

7 years agoqa/suites/rados/verify/validater/valgrind: whitelist PG_
Sage Weil [Sat, 12 Aug 2017 18:18:59 +0000 (14:18 -0400)]
qa/suites/rados/verify/validater/valgrind: whitelist PG_

Peering might be slow due to valgrind.

Signed-off-by: Sage Weil <sage@redhat.com>
(cherry picked from commit 41e5a85308ba71caf7de0e516811ee3b1228b310)

7 years agoqa/suites/rados/multimon/tasks/mon_lock_with_skew: whitelist PG_
Sage Weil [Sat, 12 Aug 2017 18:15:15 +0000 (14:15 -0400)]
qa/suites/rados/multimon/tasks/mon_lock_with_skew: whitelist PG_

Default pool pgs not up because mons too broken for OSDs to peer.

Signed-off-by: Sage Weil <sage@redhat.com>
(cherry picked from commit 12007044b1a039bd7d22e5a6c0253263bdaa0419)

7 years agoMerge pull request #17003 from liewegas/wip-luminous-squelch-mon-down
Sage Weil [Sat, 12 Aug 2017 18:16:08 +0000 (13:16 -0500)]
Merge pull request #17003 from liewegas/wip-luminous-squelch-mon-down

qa/tasks/thrashosds-health.yaml: ignore MON_DOWN

7 years agoqa/tasks/thrashosds-health.yaml: ignore MON_DOWN 17003/head
Sage Weil [Sat, 12 Aug 2017 18:08:01 +0000 (14:08 -0400)]
qa/tasks/thrashosds-health.yaml: ignore MON_DOWN

See http://tracker.ceph.com/issues/20910

It's not clear why this is happening (it could just be load on the
test machines) but it's very noisy, so silencing it for now on
the luminous branch.

Signed-off-by: Sage Weil <sage@redhat.com>

7 years agoos/bluestore: fail early on very large objects
Sage Weil [Tue, 8 Aug 2017 22:12:57 +0000 (18:12 -0400)]
os/bluestore: fail early on very large objects

We have a hard 4GB object size limit (although in practice we want
to be *well* below that!).

See http://tracker.ceph.com/issues/20923
Signed-off-by: Sage Weil <sage@redhat.com>
(cherry picked from commit 70f6760d3793d5a06e3c583608d220b354e65f84)
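A fail-early size check of the kind this commit describes might look like the sketch below. The constant value, the comparison, and the choice of E2BIG are assumptions made for illustration; the commit message only states that there is a hard 4GB limit.

```python
import errno

# Assumed cap just under 4 GiB; the real constant is not given in the log.
OBJECT_MAX_SIZE = 0xFFFFFFFF


def check_write_extent(offset, length):
    """Return 0 if the write fits under the cap, else a negative errno."""
    if offset + length >= OBJECT_MAX_SIZE:
        return -errno.E2BIG   # reject before any allocation or I/O happens
    return 0
```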

7 years agoos/bluestore: do not segv on kraken upgrade debug print
Sage Weil [Fri, 11 Aug 2017 15:58:42 +0000 (11:58 -0400)]
os/bluestore: do not segv on kraken upgrade debug print

When loading an onode from kraken we have a compat path that calls
get_ref before the SharedBlob pointer is initialized.  This is fine except
that if debugging is enabled the operator<< on the Blob will segv on
printing *b.shared_blob (which is NULL).

Fix operator<< to print something else if it is NULL.  shared_blob does
get set up right after the call to decode() so having it be NULL at this
point is otherwise harmless.

Fixes: http://tracker.ceph.com/issues/20977
Signed-off-by: Sage Weil <sage@redhat.com>
(cherry picked from commit 50523a225b2f415b8c3d82faeae59c0eede54663)
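The fix described here is a guard in the printing path. A Python analogue, with hypothetical `Blob` and `SharedBlob` classes standing in for the BlueStore types, could look like this:

```python
class SharedBlob:
    def __init__(self, sbid):
        self.sbid = sbid


class Blob:
    def __init__(self):
        # Not yet initialized; in the scenario above it is set right after decode.
        self.shared_blob = None

    def __repr__(self):
        # Dereferencing shared_blob unconditionally would fail while it is unset,
        # so print a placeholder instead.
        if self.shared_blob is None:
            return "Blob((no shared_blob))"
        return f"Blob(sbid={self.shared_blob.sbid})"
```

The guard lives in the printer rather than in the caller because, as the message notes, the unset pointer is otherwise harmless at that point.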

7 years agoos/bluestore: fix clone dirty_range again
Sage Weil [Fri, 11 Aug 2017 16:46:09 +0000 (12:46 -0400)]
os/bluestore: fix clone dirty_range again

If we are cloning a blob for a 1 byte logical extent then dirty_range_begin
will equal _end and we won't dirty the source onode (with possibly newly
shared blobs).

Fix by using a separate flag to indicate whether we are dirtying instead
of overloading the begin/end markers for this.  Note that even if they
are equal dirty_range will still dirty the shard in question.

This is a result of 0ae5d92d42500e5ab08253a00bda47b957767ebc.

Fixes: http://tracker.ceph.com/issues/20983
Signed-off-by: Sage Weil <sage@redhat.com>
(cherry picked from commit d5ba7061ee588c232138af1d880faf09be4adeed)

7 years agoos/memstore: memstore_page_set=false
Sage Weil [Fri, 11 Aug 2017 17:31:51 +0000 (13:31 -0400)]
os/memstore: memstore_page_set=false

This regularly returns bad results, see http://tracker.ceph.com/issues/20738

Signed-off-by: Sage Weil <sage@redhat.com>
(cherry picked from commit 7c96868797702eb3567a7cf7aa0e0a7ce422612c)

7 years agoqa/suites/rados/multimon: whitelist mgr down vs clock skew test
Sage Weil [Fri, 11 Aug 2017 17:42:02 +0000 (13:42 -0400)]
qa/suites/rados/multimon: whitelist mgr down vs clock skew test

Clock skew might make us fail the mgr.

Signed-off-by: Sage Weil <sage@redhat.com>
(cherry picked from commit ad23d7dc1fb21cbb330ad639d3a41613dc2a9c43)

7 years agomon: correctly print out mds versions (instead of mon ones) 17001/head
Greg Farnum [Fri, 11 Aug 2017 22:31:43 +0000 (15:31 -0700)]
mon: correctly print out mds versions (instead of mon ones)

Signed-off-by: Greg Farnum <gfarnum@redhat.com>
(cherry picked from commit 6f7545e0b58c3e25aff386f179c0fb77bb96adc2)

7 years agoosd: be more precise about our asserts and cases when rebuilding missing sets and...
Greg Farnum [Fri, 11 Aug 2017 22:10:01 +0000 (15:10 -0700)]
osd: be more precise about our asserts and cases when rebuilding missing sets and handling divergent priors

Fixes: http://tracker.ceph.com/issues/20985
Signed-off-by: Greg Farnum <gfarnum@redhat.com>
(cherry picked from commit a9921a2fdaab823bf348ae7fc773647c9664952d)

7 years agoMerge commit 'c56d9c07b342c08419bbc18dcf2a4c5fae62b9cf' into luminous
Sage Weil [Fri, 11 Aug 2017 19:43:41 +0000 (15:43 -0400)]
Merge commit 'c56d9c07b342c08419bbc18dcf2a4c5fae62b9cf' into luminous

7 years ago12.1.3
Jenkins Build Slave User [Thu, 10 Aug 2017 19:22:43 +0000 (19:22 +0000)]
12.1.3

7 years agorados/tool: fixup rados stat command hint
huanwen ren [Fri, 11 Aug 2017 02:50:59 +0000 (10:50 +0800)]
rados/tool: fixup rados stat command hint

Signed-off-by: huanwen ren <ren.huanwen@zte.com.cn>
(cherry picked from commit 85f3feb912464dcedc888396b4d1f9b23d19cb82)

7 years agocrush: "osd crush class rename" support
xie xingguo [Thu, 10 Aug 2017 09:28:04 +0000 (17:28 +0800)]
crush: "osd crush class rename" support

In 076a6abd80cc90ebcb901f908f880ef030721b2a I killed the 'class rename' command
and thought it was totally useless but I was wrong.

Consider the following use case:
(1) randomly choose some OSDs (e.g., from different hosts) and reserve them for private use only,
    say, by grouping them into 'pool1'
(2) ceph osd crush set-device-class pool1 'OSDs from (1)'
(3) ceph osd crush rule create-replicated rule_for_pool1 default host pool1
(4) ceph osd pool rename pool1 pool2
(5) ceph osd crush class rename pool1 pool2

From the above use case, we need to safely change a pool name without worrying
about any risk of data migration. That is why the 'osd crush class rename' command
is still needed here.

Signed-off-by: xie xingguo <xie.xingguo@zte.com.cn>
(cherry picked from commit d792e8d528768eeccc6ad71af5aa3fd81780eb2e)

7 years agoceph_test_objectstore: drop expect regex
Sage Weil [Thu, 10 Aug 2017 14:41:47 +0000 (10:41 -0400)]
ceph_test_objectstore: drop expect regex

If logging is enabled (as it now is in teuthology) this won't match the
forked output.

Signed-off-by: Sage Weil <sage@redhat.com>
(cherry picked from commit 863468803de3f92f9bada6bffd010a204ee996d7)

7 years agoqa: test that "fs new" correctly set the application_metadata
Greg Farnum [Thu, 10 Aug 2017 17:28:09 +0000 (10:28 -0700)]
qa: test that "fs new" correctly set the application_metadata

Signed-off-by: Greg Farnum <gfarnum@redhat.com>
(cherry picked from commit c85af7b14681be916cea1119f3f9cdf556e2707a)

7 years agomdsmon: treat the osdmon correctly when doing plugged updates
Greg Farnum [Wed, 9 Aug 2017 21:34:44 +0000 (14:34 -0700)]
mdsmon: treat the osdmon correctly when doing plugged updates

Make sure it's writeable before invoking changes, and propose_pending()
on it when we're done.
Make the PaxosService::C_RetryMessage public so we can do this from FSCommands.

Maybe-
Fixes: http://tracker.ceph.com/issues/20959
Signed-off-by: Greg Farnum <gfarnum@redhat.com>
(cherry picked from commit 435717791ec499f71c9d1485b1e4e63239a343e2)

7 years agomdsmon: don't add pool application metadata until running fully-luminous
Greg Farnum [Wed, 9 Aug 2017 20:46:30 +0000 (13:46 -0700)]
mdsmon: don't add pool application metadata until running fully-luminous

Signed-off-by: Greg Farnum <gfarnum@redhat.com>
(cherry picked from commit bcd3554bc5a3965077af655498fd434910f13040)

7 years ago12.1.3 v12.1.3
Jenkins Build Slave User [Thu, 10 Aug 2017 19:22:43 +0000 (19:22 +0000)]
12.1.3

7 years agoMerge pull request #16969 from linuxbox2/wip-disable-dynreshard-L
Sage Weil [Thu, 10 Aug 2017 18:41:40 +0000 (13:41 -0500)]
Merge pull request #16969 from linuxbox2/wip-disable-dynreshard-L

rgw: disable dynamic resharding for 1st L point release

Reviewed-by: Orit Wasserman <owasserm@redhat.com>

7 years agorgw: disable dynamic resharding for 1st L point release 16969/head
Matt Benjamin [Thu, 10 Aug 2017 14:40:25 +0000 (10:40 -0400)]
rgw: disable dynamic resharding for 1st L point release

Temporarily default RGW dynamic bucket resharding to OFF (usability
standup).

Signed-off-by: Matt Benjamin <mbenjamin@redhat.com>

7 years agoos/bluestore: clearer comments, not slower code.
Mark Nelson [Mon, 7 Aug 2017 14:23:12 +0000 (09:23 -0500)]
os/bluestore:  clearer comments, not slower code.

Signed-off-by: Mark Nelson <mnelson@redhat.com>
(cherry picked from commit da92220d9d6936ef1aa19333038b7a1d982485a4)

7 years agomon/Elector: force election epoch bump on start
Sage Weil [Tue, 8 Aug 2017 22:43:22 +0000 (18:43 -0400)]
mon/Elector: force election epoch bump on start

We are generally careful when bumping the epoch so that we can join
existing rounds.  However, if we restart in the middle of an election,
and change versions, we need to be certain that our previous ACK (as
$version - 1) isn't accepted as truth for the restarted daemon (running
$version) keeping the same epoch.

The conservatism with bumping is to avoid spurious election cycles, but
mon restarts are more rare, and we need them here.

Fixes: http://tracker.ceph.com/issues/20949
Signed-off-by: Sage Weil <sage@redhat.com>
(cherry picked from commit ef425374250014393c1d432a3eda95179bb70537)
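The effect of forcing an epoch bump on start can be illustrated with a toy model. Everything here is hypothetical (the `Elector` class, the `election_epoch` key, a dict used as the persistent store), and the real elector's epoch rules are more involved than a plain increment.

```python
class Elector:
    """Toy elector; `store` is a dict standing in for persistent monitor state."""

    def __init__(self, store):
        self.store = store
        self.epoch = store.get("election_epoch", 0)

    def start(self):
        # Always move to a fresh epoch on (re)start, so an ACK sent by the
        # previous incarnation cannot be matched against this one.
        self.epoch += 1
        self.store["election_epoch"] = self.epoch

    def accept_ack(self, ack_epoch):
        # Only ACKs for the current epoch count.
        return ack_epoch == self.epoch
```

With the bump in place, an ACK recorded under the pre-restart epoch no longer matches the restarted daemon, at the cost of one extra election round per restart.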

7 years agoqa/suites/upgrade/kraken-x/stress-split: more whitelisting
Sage Weil [Wed, 9 Aug 2017 13:11:05 +0000 (09:11 -0400)]
qa/suites/upgrade/kraken-x/stress-split: more whitelisting

Signed-off-by: Sage Weil <sage@redhat.com>
(cherry picked from commit b61be07d45e7cc1b1f4b606408bf4ed04853fb9b)

7 years agoMerge pull request #16970 from ceph/backport-16919
Alfredo Deza [Thu, 10 Aug 2017 15:58:15 +0000 (11:58 -0400)]
Merge pull request #16970 from ceph/backport-16919

Backport: "ceph-volume: adds functional CI testing #16919"

Reviewed-by: Alfredo Deza <adeza@redhat.com>

7 years agoceph-volume: is_mounted should use a bytes->string util to compare strings 16970/head
Alfredo Deza [Thu, 10 Aug 2017 13:11:58 +0000 (09:11 -0400)]
ceph-volume: is_mounted should use a bytes->string util to compare strings

Signed-off-by: Alfredo Deza <adeza@redhat.com>
(cherry picked from commit 4bb4c432dc17139b4f891d0fda229007b4913c0d)

7 years agoceph-volume: create a utf-8 string decoder for py3 compat
Alfredo Deza [Thu, 10 Aug 2017 13:11:27 +0000 (09:11 -0400)]
ceph-volume: create a utf-8 string decoder for py3 compat

Signed-off-by: Alfredo Deza <adeza@redhat.com>
(cherry picked from commit b50f1fb5527447c323fa51fb5c0ba260655dfb0d)
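A bytes-to-string helper of the kind this commit describes can be very small. The sketch below is an assumption about its general shape, and the name `as_string` is used for illustration only.

```python
def as_string(value):
    """Return `value` as text, decoding bytes as UTF-8 and passing str through."""
    if isinstance(value, bytes):
        return value.decode("utf-8")
    return value
```

On Python 3, process output arrives as bytes, so comparing it against a str device path is always false unless one side is converted first:

```python
assert b"/dev/sdb1" != "/dev/sdb1"
assert as_string(b"/dev/sdb1") == "/dev/sdb1"
```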

7 years agoceph-volume: tests add tests for the is_mounted utility
Alfredo Deza [Wed, 9 Aug 2017 19:56:53 +0000 (15:56 -0400)]
ceph-volume: tests add tests for the is_mounted utility

Signed-off-by: Alfredo Deza <adeza@redhat.com>
(cherry picked from commit dd4db2f5671c3ce0aded73013b64a0cc559e5502)

7 years agoceph-volume: lvm activate should check if the device is mounted to prevent errors...
Alfredo Deza [Wed, 9 Aug 2017 19:24:15 +0000 (15:24 -0400)]
ceph-volume: lvm activate should check if the device is mounted to prevent errors from mount

Signed-off-by: Alfredo Deza <adeza@redhat.com>
(cherry picked from commit c61aea41f1d07b824e169bf12328b7eb0055e23f)

7 years agoceph-volume util add a helper to check if a device is mounted
Alfredo Deza [Wed, 9 Aug 2017 19:10:18 +0000 (15:10 -0400)]
ceph-volume util add a helper to check if a device is mounted

Signed-off-by: Alfredo Deza <adeza@redhat.com>
(cherry picked from commit d77d86aae11fba01834bb8d60633f3f49126c783)
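One way to check whether a device is mounted is to scan a mount table in the format of /proc/mounts. This is a sketch, not the ceph-volume helper: the function names are hypothetical, and the table text is passed in as an argument so the example does not depend on the host.

```python
def mounted_devices(proc_mounts_text):
    """Collect the device column from text in /proc/mounts format."""
    devices = set()
    for line in proc_mounts_text.splitlines():
        fields = line.split()
        if len(fields) >= 2:          # device, mount point, then type and options
            devices.add(fields[0])
    return devices


def is_mounted(device, proc_mounts_text):
    return device in mounted_devices(proc_mounts_text)
```

In practice the text would come from reading `/proc/mounts`; an activate step could call this first and skip the mount when the device is already present.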

7 years agoceph-volume: lvm activate should not ignore exit status codes
Alfredo Deza [Wed, 9 Aug 2017 12:20:33 +0000 (08:20 -0400)]
ceph-volume: lvm activate should not ignore exit status codes

Signed-off-by: Alfredo Deza <adeza@redhat.com>
(cherry picked from commit c866123017a1defac249bebe76cc7bbaddf3cf67)
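Not ignoring exit status codes usually means turning a non-zero return code into an error the caller cannot miss. A minimal sketch with the standard `subprocess` module follows; `run_checked` and `CommandFailed` are hypothetical names, not the ceph-volume API.

```python
import subprocess


class CommandFailed(RuntimeError):
    def __init__(self, command, returncode, stderr):
        super().__init__(f"{command!r} exited with status {returncode}: {stderr}")
        self.returncode = returncode


def run_checked(command):
    """Run `command` (a list of arguments) and return its stdout as text."""
    result = subprocess.run(command, stdout=subprocess.PIPE, stderr=subprocess.PIPE)
    if result.returncode != 0:
        # Surface the failure instead of carrying on with later steps.
        raise CommandFailed(
            command, result.returncode, result.stderr.decode("utf-8", "replace")
        )
    return result.stdout.decode("utf-8")
```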

7 years agoceph-volume: remove unused config from vagrant_variables.yml files
Andrew Schoen [Tue, 8 Aug 2017 17:43:53 +0000 (12:43 -0500)]
ceph-volume: remove unused config from vagrant_variables.yml files

Signed-off-by: Andrew Schoen <aschoen@redhat.com>
(cherry picked from commit 855ce630695ed9ca53c314b7e261ec3cc499787d)

7 years agoceph-volume: adds CEPH_VOLUME_DEBUG=1 to functional tests
Andrew Schoen [Tue, 8 Aug 2017 17:24:41 +0000 (12:24 -0500)]
ceph-volume: adds CEPH_VOLUME_DEBUG=1 to functional tests

This will show us tracebacks if ceph-volume fails

Signed-off-by: Andrew Schoen <aschoen@redhat.com>
(cherry picked from commit 5a90f4c577bf371a36bf602dc8ea01663aaffe00)

7 years agoceph-volume: add placeholders for prepare_activate testing in tox.ini
Andrew Schoen [Tue, 8 Aug 2017 17:05:40 +0000 (12:05 -0500)]
ceph-volume: add placeholders for prepare_activate testing in tox.ini

Signed-off-by: Andrew Schoen <aschoen@redhat.com>
(cherry picked from commit 852a94734f69bfb5544e2d3af34b9e71057df851)

7 years agoceph-volume: adds the xenial distro factor
Andrew Schoen [Tue, 8 Aug 2017 16:44:22 +0000 (11:44 -0500)]
ceph-volume: adds the xenial distro factor

Signed-off-by: Andrew Schoen <aschoen@redhat.com>
(cherry picked from commit 1b4275417d457e56eb0c3ac07597447cad7737ae)

7 years agoceph-volume: create a centos7 factor for functional testing
Andrew Schoen [Tue, 8 Aug 2017 16:41:15 +0000 (11:41 -0500)]
ceph-volume: create a centos7 factor for functional testing

We want to run these tests on multiple distros so this change sets the
foundation for that.

Signed-off-by: Andrew Schoen <aschoen@redhat.com>
(cherry picked from commit c8e3be6faed1c062826117bc1355b7b752c01bd4)

7 years agoceph-volume: vagrantfile runs storagectl once
Alfredo Deza [Tue, 8 Aug 2017 15:07:02 +0000 (11:07 -0400)]
ceph-volume: vagrantfile runs storagectl once

It assumes that if there is a disk left it has already run. This avoids
issues when reloading/restarting machines with vagrant.

Signed-off-by: Alfredo Deza <adeza@redhat.com>
(cherry picked from commit 476d1f50b82f81addd0de218c57b773f81883b0f)

7 years agoceph-volume: setup nodes for testinfra testing
Andrew Schoen [Tue, 8 Aug 2017 15:16:12 +0000 (10:16 -0500)]
ceph-volume: setup nodes for testinfra testing

This uses the playbook that exists in ceph-ansible to prepare the nodes for
testing by installing net-tools.

Signed-off-by: Andrew Schoen <aschoen@redhat.com>
(cherry picked from commit 97b216fdd1657b34db93b264ce814a8f72434d7b)

7 years agoceph-volume: tox define vagrant cwd
Alfredo Deza [Mon, 7 Aug 2017 20:34:22 +0000 (16:34 -0400)]
ceph-volume: tox define vagrant cwd

Signed-off-by: Alfredo Deza <adeza@redhat.com>
(cherry picked from commit 89ccbd8ab4f10832c6bb7e3660e00cce62af4a6b)

7 years agoceph-volume: adds a functional testing scenario for lvm create
Andrew Schoen [Fri, 4 Aug 2017 15:40:39 +0000 (10:40 -0500)]
ceph-volume: adds a functional testing scenario for lvm create

This sets up the basic test harness and adds a test for the create
subcommand. The test uses ceph-ansible to deploy a cluster using
``ceph-volume lvm create``, tests the cluster state using the
ceph-ansible test suite, reboots the nodes and then tests again.

Signed-off-by: Andrew Schoen <aschoen@redhat.com>
(cherry picked from commit 750d9f4125783b29a7af44acb2c3caa43bee707a)

7 years agoRevert "qa/suites/upgrade/jewel-x/parallel: thrash layout"
Sage Weil [Thu, 10 Aug 2017 13:50:54 +0000 (09:50 -0400)]
Revert "qa/suites/upgrade/jewel-x/parallel: thrash layout"

This reverts commit cabd44af3503c368160fef7e56b637dfbf0e9921.  This test
combination is not yet stable.

Signed-off-by: Sage Weil <sage@redhat.com>

7 years agoqa/tasks/ceph.py: tolerate flush pg stats exception
Sage Weil [Tue, 8 Aug 2017 16:08:31 +0000 (12:08 -0400)]
qa/tasks/ceph.py: tolerate flush pg stats exception

If the OSD doesn't see IO, it won't flush more pg/osd stats when the
luminous flag is not yet set (legacy pgmonitor mode).

Signed-off-by: Sage Weil <sage@redhat.com>
(cherry picked from commit 9da7e63c641d343853ed83bb22490cb6af1f3d6d)

7 years agoqa/suites/upgrade/jewel-x/parallel: thrash layout
Sage Weil [Wed, 9 Aug 2017 20:40:43 +0000 (16:40 -0400)]
qa/suites/upgrade/jewel-x/parallel: thrash layout

We can't kill and restart osds because that will interfere with
the upgrade process.  We can, however, thrash the layout by
tweaking osd weights and so on.  This will exercise osd recovery
paths during the upgrade that aren't normally exercised (outside
of stress-split..which doesn't upgrade individual osds while they
are non-clean).

Signed-off-by: Sage Weil <sage@redhat.com>
(cherry picked from commit 435777dbffc77c93d06476caf83be141359a5778)

7 years agoosd/PG: force rebuild of missing set on jewel upgrade
Sage Weil [Wed, 9 Aug 2017 16:50:57 +0000 (12:50 -0400)]
osd/PG: force rebuild of missing set on jewel upgrade

Previously we were detecting the need to rebuild missing based on
whether the "divergent_priors" omap key was present.  Unfortunately,
jewel does not always set this, so it is not a reliable indicator.
(It only gets set if you actually have a divergent prior at some
point in the PG's life time on that OSD.)

Fix by using the info_struct_v on the PG to detect whether we need
to do the conversion.  We didn't bump the value when we added
the missing persistence, but the fastinfo was also added during
the same period between jewel and kraken, so it will work just as
well.

Fixes: http://tracker.ceph.com/issues/20958
Signed-off-by: Sage Weil <sage@redhat.com>
(cherry picked from commit dd1a25218c5a1def02146edb2dce0d97a71e4436)

7 years agoMerge pull request #16922 from ivancich/luminous-16755
Gregory Farnum [Wed, 9 Aug 2017 18:29:00 +0000 (11:29 -0700)]
Merge pull request #16922 from ivancich/luminous-16755

Merge pull request #16755 from ivancich/wip-pull-new-dmclock

7 years agomon/OSDMonitor: implement 'osd crush ls <node>'
Sage Weil [Tue, 8 Aug 2017 19:56:18 +0000 (15:56 -0400)]
mon/OSDMonitor: implement 'osd crush ls <node>'

Signed-off-by: Sage Weil <sage@redhat.com>
(cherry picked from commit aeed87945b7e3af3c76b8e39739725ce6b09ad56)

7 years agoqa/suites/upgrade/jewel-x/point-to-point-x: disable app warnings
Sage Weil [Wed, 9 Aug 2017 13:18:54 +0000 (09:18 -0400)]
qa/suites/upgrade/jewel-x/point-to-point-x: disable app warnings

Signed-off-by: Sage Weil <sage@redhat.com>
(cherry picked from commit bbd5fe354c507bce7aad0c7c37036a47dbf624a3)

7 years agoMerge pull request #16948 from dillaman/wip-32bit-compat-fixes-luminous
Jason Dillaman [Wed, 9 Aug 2017 16:27:13 +0000 (12:27 -0400)]
Merge pull request #16948 from dillaman/wip-32bit-compat-fixes-luminous

luminous: rbd-mirror: align use of uint64_t in service_daemon::AttributeType

Reviewed-by: Jason Dillaman <dillaman@redhat.com>

7 years agoAlign use of uint64_t in service_daemon::AttributeType 16948/head
James Page [Wed, 9 Aug 2017 09:04:37 +0000 (10:04 +0100)]
Align use of uint64_t in service_daemon::AttributeType

size_t on a 32-bit architecture is a 32-bit unsigned int, which
created ambiguity when converting to bool, uint64_t or std::string
(the alternatives of the boost::variant service_daemon::AttributeType).

Align to use of uint64_t to resolve compilation failures in
all 32-bit architectures.

Signed-off-by: James Page <james.page@ubuntu.com>
(cherry picked from commit 87fe8e81bc8c9b55c6bef4144714a33e042dc2f7)

7 years agoMerge pull request #16946 from trociny/wip-20954-luminous
Jason Dillaman [Wed, 9 Aug 2017 14:32:51 +0000 (10:32 -0400)]
Merge pull request #16946 from trociny/wip-20954-luminous

luminous: qa/workunits/rbd: use command line option to specify watcher asok

Reviewed-by: Jason Dillaman <dillaman@redhat.com>

7 years agoqa/workunits/rbd: use command line option to specify watcher asok 16946/head
Mykola Golub [Tue, 8 Aug 2017 18:50:47 +0000 (20:50 +0200)]
qa/workunits/rbd: use command line option to specify watcher asok

The previous method to get the watcher admin socket was fragile
and had started to fail after the recent changes to vstart ceph.conf.

Fixes: http://tracker.ceph.com/issues/20954
Signed-off-by: Mykola Golub <mgolub@mirantis.com>
(cherry picked from commit 6a575136a76b9e291c0948e1179f33a2f73853fb)

7 years agoMerge pull request #16943 from theanalyst/wip-luminous-16889
Abhishek L [Wed, 9 Aug 2017 14:03:57 +0000 (16:03 +0200)]
Merge pull request #16943 from theanalyst/wip-luminous-16889

luminous: rgw: Use namespace for lc_pool and roles_pool

Reviewed-By: Orit Wasserman <owasserm@redhat.com>
Reviewed-By: Matt Benjamin <mbenjamin@redhat.com>

7 years agorgw: use namespace for roles pool 16943/head
Orit Wasserman [Tue, 8 Aug 2017 08:24:06 +0000 (11:24 +0300)]
rgw:  use namespace for roles pool

Signed-off-by: Orit Wasserman <owasserm@redhat.com>
(cherry picked from commit 4c378ffbbd984d4c7985415ea067661b7b3a2e98)

7 years agorgw: initialize lc pool as namespace
Orit Wasserman [Tue, 8 Aug 2017 08:22:42 +0000 (11:22 +0300)]
rgw: initialize lc pool as namespace

Signed-off-by: Orit Wasserman <owasserm@redhat.com>
Fixes: http://tracker.ceph.com/issues/20177
(cherry picked from commit 39d76cad38272b8f6db79bdb51a054fa41189b41)

7 years agoRevert "os/bluestore: allow multiple DeferredBatches in flight at once"
Sage Weil [Tue, 8 Aug 2017 13:23:31 +0000 (09:23 -0400)]
Revert "os/bluestore: allow multiple DeferredBatches in flight at once"

This reverts commit ca32d575eb2673737198a63643d5d1923151eba3.

If we have multiple batches in flight then we have to worry about writes
to the same blocks reordering.

Also, 3c6a6c46d5808d6c42ed4dcfb441bad64366686b is sufficient to avoid the
stall/deadlock in http://tracker.ceph.com/issues/20295.

# Conflicts:
# src/os/bluestore/BlueStore.cc

Fixes: http://tracker.ceph.com/issues/20925
Signed-off-by: Sage Weil <sage@redhat.com>
(cherry picked from commit 917858516a904c82fabe1bd65d2fa88436319713)

7 years agomon: add mon_health_preluminous_compat_warning
Sage Weil [Tue, 8 Aug 2017 14:28:10 +0000 (10:28 -0400)]
mon: add mon_health_preluminous_compat_warning

Signed-off-by: Sage Weil <sage@redhat.com>
(cherry picked from commit 7c22b46e907ad239fb9efaee96f3ef424beb220a)

7 years agoosd: downgrade (ok) PG scrub messages to debug
John Spray [Tue, 8 Aug 2017 18:53:11 +0000 (19:53 +0100)]
osd: downgrade (ok) PG scrub messages to debug

Otherwise someone watching the log at INFO level gets
pelted with potentially millions of log messages
while the system is scrubbing.

Fixes: http://tracker.ceph.com/issues/20947
Signed-off-by: John Spray <john.spray@redhat.com>
(cherry picked from commit 0443fdb0221bbcd33c8d1278fdd3959e710fcacb)
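The same idea, expressed with Python's standard `logging` module: successful scrubs are logged at DEBUG so an INFO-level view stays quiet, while problems keep a higher level. The function name and message formats are hypothetical and only illustrate the level change.

```python
import logging


def report_scrub(log, pgid, errors):
    """Log one scrub result; only failures are visible at INFO and above."""
    if errors:
        log.error("%s scrub %d errors", pgid, errors)
    else:
        log.debug("%s scrub ok", pgid)   # was INFO-level noise before
```

With the logger set to INFO, a clean scrub of every PG now produces no output at all, and DEBUG can still be enabled when the per-PG detail is wanted.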

7 years agomon: downgrade "scrub ok" message to debug.
John Spray [Tue, 8 Aug 2017 18:36:03 +0000 (19:36 +0100)]
mon: downgrade "scrub ok" message to debug.

This hides lines like:
[INF]  scrub ok on 0,1,2: ScrubResult(keys {pgmap_pg=13} crc {pgmap_pg=2458062599})

from the normal cluster log views.

Fixes: http://tracker.ceph.com/issues/20947
Signed-off-by: John Spray <john.spray@redhat.com>
(cherry picked from commit f394ca0bad248e05665a912558382ca2ea560a91)

7 years agoMerge pull request #16921 from dillaman/wip-rbd-ls-luminous
Jason Dillaman [Wed, 9 Aug 2017 01:19:25 +0000 (21:19 -0400)]
Merge pull request #16921 from dillaman/wip-rbd-ls-luminous

luminous: rbd: parallelize rbd ls -l

Reviewed-by: Jason Dillaman <dillaman@redhat.com>

7 years agoMerge pull request #16755 from ivancich/wip-pull-new-dmclock 16922/head
Gregory Farnum [Tue, 8 Aug 2017 21:27:28 +0000 (14:27 -0700)]
Merge pull request #16755 from ivancich/wip-pull-new-dmclock

osd: bring in dmclock library changes

Reviewed-by: J. Eric Ivancich <ivancich@redhat.com>
Reviewed-by: Greg Farnum <gfarnum@redhat.com>
(cherry picked from commit 25f1edefbf21f17f5501d9894f0c4979c04b3f08)

7 years agorbd: parallelize rbd ls -l 16921/head
Piotr Dałek [Wed, 7 Jun 2017 14:01:37 +0000 (16:01 +0200)]
rbd: parallelize rbd ls -l

When a cluster contains a large number of images, "rbd ls -l" takes a
long time to finish. In my particular case, it took about 58s to
process 3000 images.
"rbd ls -l" opens each image, and that takes the majority of the time, so
improve this by using aio_open() and aio_close() to do it
asynchronously. This reduced total processing time to around 15
seconds when using the default of 10 concurrently opened images.

Signed-off-by: Piotr Dałek <piotr.dalek@corp.ovh.com>
(cherry picked from commit 8f76fc861b0a628fa2269b04f77b7e31d4a7a006)
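The bounded-concurrency pattern behind this change can be sketched in Python. Note the swap: the commit uses librbd's asynchronous aio_open() and aio_close(), whereas this sketch uses a thread pool to get the same "at most N images open at once" behaviour. `describe` is a hypothetical callable that opens one image, reads its details, and closes it.

```python
from concurrent.futures import ThreadPoolExecutor


def list_images_long(names, describe, max_open=10):
    """Call describe(name) for every image, with at most max_open in flight.

    Results come back in the same order as `names`.
    """
    with ThreadPoolExecutor(max_workers=max_open) as pool:
        return list(pool.map(describe, names))
```

Because the per-image cost is dominated by waiting on the open, overlapping the waits is what cuts the wall-clock time; the default of 10 mirrors the figure quoted in the message.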

7 years agoMerge pull request #16914 from theanalyst/wip-16734
Abhishek L [Tue, 8 Aug 2017 18:53:40 +0000 (20:53 +0200)]
Merge pull request #16914 from theanalyst/wip-16734

luminous: rgw_lc: support for AWSv4 authentication

Reviewed-By: Daniel Gryniewicz <dang@redhat.com>
Reviewed-By: Radoslaw Zarzynski <rzarzynski@redhat.com>
Reviewed-By: Matt Benjamin <mbenjami@redhat.com>

7 years agorgw_lc: support for AWSv4 authentication 16914/head
Abhishek Lekshmanan [Tue, 1 Aug 2017 15:42:31 +0000 (17:42 +0200)]
rgw_lc: support for AWSv4 authentication

adding support for AWSv4 authentication for Put Object LC, also adding
types to all of the LC ops in the process

Signed-off-by: Abhishek Lekshmanan <abhishek@suse.com>
(cherry picked from commit cc51f32c22f294f59b369350e54b86892015cbab)

7 years agoMerge pull request #16912 from dillaman/wip-20701-luminous
Jason Dillaman [Tue, 8 Aug 2017 17:52:46 +0000 (13:52 -0400)]
Merge pull request #16912 from dillaman/wip-20701-luminous

luminous: doc: update rbd-mirroring documentation

Reviewed-by: Sage Weil <sage@redhat.com>

7 years agodoc/release-notes: indicate that rbd-mirror should use unique IDs 16912/head
Jason Dillaman [Tue, 8 Aug 2017 16:46:56 +0000 (12:46 -0400)]
doc/release-notes: indicate that rbd-mirror should use unique IDs

Signed-off-by: Jason Dillaman <dillaman@redhat.com>
(cherry picked from commit 3ecc35350afa8777139a76c9e63b121775c6096f)

7 years agodoc: updated rbd-mirror daemon instructions
Jason Dillaman [Tue, 8 Aug 2017 16:43:32 +0000 (12:43 -0400)]
doc: updated rbd-mirror daemon instructions

Fixes: http://tracker.ceph.com/issues/20701
Signed-off-by: Jason Dillaman <dillaman@redhat.com>
(cherry picked from commit b7ba3f68c3baf5ad8b0b307afcf2bfbfa18d597c)

7 years agodoc: re-ordered rbd table of contents
Jason Dillaman [Tue, 8 Aug 2017 15:53:42 +0000 (11:53 -0400)]
doc: re-ordered rbd table of contents

Signed-off-by: Jason Dillaman <dillaman@redhat.com>
(cherry picked from commit 34ff1ddca1d228bb785ec04f3aef6ccfdccdc5de)

7 years ago Merge pull request #16875 from dillaman/wip-lirbd-group-luminous
Jason Dillaman [Tue, 8 Aug 2017 16:57:24 +0000 (12:57 -0400)]
Merge pull request #16875 from dillaman/wip-lirbd-group-luminous

luminous: librbd: remove consistency group rbd cli and API support

Reviewed-by: Mykola Golub <mgolub@mirantis.com>

7 years ago Merge PR #16378 into HEAD
Patrick Donnelly [Tue, 8 Aug 2017 16:40:27 +0000 (09:40 -0700)]
Merge PR #16378 into HEAD

* refs/remotes/upstream/pull/16378/head:
doc: remove accidental additions to release notes
qa/cephfs: Fix race in test_volume_client
qa/cephfs: Test filtered df
PendingReleaseNotes: add note about df filtering
client: Support new, filtered MStatfs
objecter: Support new, filtered MStatfs
mon/PGMap stats: Support new, filtered MStatfs
messages: Add optional data pool to MStatfs

Reviewed-by: John Spray <john.spray@redhat.com>
Reviewed-by: Patrick Donnelly <pdonnell@redhat.com>
Reviewed-by: Sage Weil <sage@redhat.com>

7 years ago doc: remove accidental additions to release notes 16378/head
Patrick Donnelly [Tue, 8 Aug 2017 16:28:57 +0000 (09:28 -0700)]
doc: remove accidental additions to release notes

Presumably this was caused by a bad rebase.

Signed-off-by: Patrick Donnelly <pdonnell@redhat.com>

7 years ago Merge pull request #16903 from dillaman/wip-16877-luminous
Jason Dillaman [Tue, 8 Aug 2017 15:23:49 +0000 (11:23 -0400)]
Merge pull request #16903 from dillaman/wip-16877-luminous

luminous: test/librbd: fix race condition with OSD map refresh

Reviewed-by: Mykola Golub <mgolub@mirantis.com>

7 years ago test/librbd: fix race condition with OSD map refresh 16903/head
Jason Dillaman [Mon, 7 Aug 2017 18:29:07 +0000 (14:29 -0400)]
test/librbd: fix race condition with OSD map refresh

Fixes: http://tracker.ceph.com/issues/20918
Signed-off-by: Jason Dillaman <dillaman@redhat.com>
(cherry picked from commit 5c29664434c6e9e2f72aa4b02b369613250694e8)

7 years ago Merge pull request #16899 from dillaman/wip-20941-luminous
Jason Dillaman [Tue, 8 Aug 2017 13:58:28 +0000 (09:58 -0400)]
Merge pull request #16899 from dillaman/wip-20941-luminous

luminous: librbd: default localize parent reads to false

Reviewed-by: Mykola Golub <mgolub@mirantis.com>

7 years ago Merge pull request #16895 from dillaman/wip-15339-luminous
Jason Dillaman [Tue, 8 Aug 2017 13:28:32 +0000 (09:28 -0400)]
Merge pull request #16895 from dillaman/wip-15339-luminous

luminous: rbd-ggate: tool to map images on FreeBSD via GEOM Gate

Reviewed-by: Jason Dillaman <dillaman@redhat.com>
Reviewed-by: Willem Jan Withagen <wjw@digiware.nl>

7 years ago librbd: default localize parent reads to false 16899/head
Jason Dillaman [Mon, 7 Aug 2017 21:44:30 +0000 (17:44 -0400)]
librbd: default localize parent reads to false

Fixes: http://tracker.ceph.com/issues/20941
Signed-off-by: Jason Dillaman <dillaman@redhat.com>
(cherry picked from commit cfc3d4603668e23232c2f0b9af6fd838040f47ec)

7 years ago test: add wrapper to run rbd-ggate test on FreeBSD 16895/head
Mykola Golub [Sun, 6 Aug 2017 14:27:22 +0000 (16:27 +0200)]
test: add wrapper to run rbd-ggate test on FreeBSD

Signed-off-by: Mykola Golub <mgolub@mirantis.com>

7 years ago rbd-ggate: tool to map images on FreeBSD via GEOM Gate
Mykola Golub [Sun, 14 May 2017 09:00:24 +0000 (09:00 +0000)]
rbd-ggate: tool to map images on FreeBSD via GEOM Gate

rbd-ggate spawns a process responsible for creating the ggate
device and forwarding I/O requests between the GEOM Gate kernel
subsystem and RADOS.

On FreeBSD it provides functionality similar to rbd-nbd on Linux.

Signed-off-by: Mykola Golub <mgolub@mirantis.com>

7 years ago rgw: Fix the last policy use-after-free
Adam C. Emerson [Mon, 7 Aug 2017 21:46:38 +0000 (17:46 -0400)]
rgw: Fix the last policy use-after-free

Signed-off-by: Adam C. Emerson <aemerson@redhat.com>
(cherry picked from commit 5353d952683a5a13a681c594e119b570bfdc3c39)

7 years ago rgw: Fix another use after free
Adam C. Emerson [Mon, 7 Aug 2017 21:27:53 +0000 (17:27 -0400)]
rgw: Fix another use after free

This one was caused by iterator invalidation in set operations. In
this case, just replace the set entirely with a bitfield.
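
The general pattern behind this kind of fix can be sketched as follows. This is a minimal illustration, not the rgw code: the Action enum, drop_writes_set and drop_writes_mask are hypothetical names chosen for the example. Erasing from a std::set invalidates the iterator to the erased element, so any later use of that iterator is a use after free. A std::bitset over a small fixed universe of values has no nodes and no iterators, so that class of bug cannot occur.

```cpp
#include <bitset>
#include <cstddef>
#include <set>

// Hypothetical flag universe for illustration only; the real rgw code
// defines its own identifiers.
enum Action : std::size_t { Read = 0, Write = 1, Delete = 2, ActionCount = 3 };

// Set-based version, written safely: erase() returns the next valid
// iterator, so the iterator to the erased node is never touched again.
// Writing `s.erase(it); ++it;` instead would use a dangling iterator.
inline void drop_writes_set(std::set<Action>& s) {
  for (auto it = s.begin(); it != s.end();) {
    if (*it == Write) {
      it = s.erase(it);
    } else {
      ++it;
    }
  }
}

// Bitfield version: membership is one bit per Action, so removal is a
// plain bit reset and there are no iterators to invalidate.
using ActionMask = std::bitset<ActionCount>;

inline void drop_writes_mask(ActionMask& m) {
  m.reset(Write);
}
```

The bitfield form only works when the set of possible values is small and known at compile time, which is the assumption made here.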

Signed-off-by: Adam C. Emerson <aemerson@redhat.com>
(cherry picked from commit 97d026dde679cabf1aaf026be3f08bfef63c140f)

7 years ago rgw: Fix use after free in IAM policy parser
Adam C. Emerson [Mon, 24 Jul 2017 20:10:11 +0000 (16:10 -0400)]
rgw: Fix use after free in IAM policy parser

Signed-off-by: Adam C. Emerson <aemerson@redhat.com>
(cherry picked from commit 8377ba6525de5ebfe33a7dda14f17d96e8ac4ef4)

7 years ago qa/suites/upgrade/kraken-x/stress-split*: whitelist
Sage Weil [Mon, 7 Aug 2017 20:02:33 +0000 (16:02 -0400)]
qa/suites/upgrade/kraken-x/stress-split*: whitelist

Signed-off-by: Sage Weil <sage@redhat.com>
(cherry picked from commit bf29142b0828d953da09aab83a7fe44a5ce4fe78)

7 years ago qa/suites/upgrade/kraken-x/parallel: whitelist
Sage Weil [Mon, 7 Aug 2017 19:57:55 +0000 (15:57 -0400)]
qa/suites/upgrade/kraken-x/parallel: whitelist

Signed-off-by: Sage Weil <sage@redhat.com>
(cherry picked from commit 2234a0ed11ba1b3688e2ae506a1128840507883d)

7 years ago qa/suites/upgrade/jewel-x/parallel: fix POOL_APP_NOT_ENABLED disable
Sage Weil [Mon, 7 Aug 2017 13:49:55 +0000 (09:49 -0400)]
qa/suites/upgrade/jewel-x/parallel: fix POOL_APP_NOT_ENABLED disable

This code runs on the mgr.

Signed-off-by: Sage Weil <sage@redhat.com>
(cherry picked from commit 3e7d157871880fc5c7acc436dcb4eafa078df128)

7 years ago mon/MonCommands: mark 'pg force_create_pg' deprecated
Sage Weil [Sat, 5 Aug 2017 19:33:37 +0000 (15:33 -0400)]
mon/MonCommands: mark 'pg force_create_pg' deprecated

It's deprecated.

Also, this avoids a duplicate entry when we have an upgrading mon
cluster and the command is also in PGMonitorCommands.

Signed-off-by: Sage Weil <sage@redhat.com>
(cherry picked from commit 7c37c86bb20ebc16c1d34a3acb7ad3f183d6e0e0)