git.apps.os.sepia.ceph.com Git - ceph.git/log
ceph.git
6 years ago rgw_file: all directories are virtual with respect to contents 28887/head
Matt Benjamin [Fri, 7 Jun 2019 14:20:01 +0000 (10:20 -0400)]
rgw_file: all directories are virtual with respect to contents

This change causes directory handles to always report an mtime of
"now."  This is not an invalidation per se; it interacts with the
NFS implementation to produce that result when the implementation
updates its cached attributes.  Hence, it can be modulated by timers
or other rules governing attribute caching at the upper layer.
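
A minimal sketch of the idea, not the rgw_file code: the attribute
lookup reports the current time as mtime for directories and the stored
mtime for everything else. `Attrs` and `reported_mtime` are hypothetical
names introduced only for illustration.

```cpp
#include <cassert>
#include <ctime>

// Hypothetical attribute record; not an rgw_file type.
struct Attrs {
  bool is_dir;
  std::time_t mtime;  // stored modification time
};

// Directories always report "now"; other objects report their stored
// mtime. `now` is passed in so the caller controls the clock.
std::time_t reported_mtime(const Attrs& a, std::time_t now) {
  return a.is_dir ? now : a.mtime;
}
```

An upper layer that caches attributes would see a newer directory mtime
each time it refreshes, so its own timers decide how often contents are
re-read.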

Fixes: http://tracker.ceph.com/issues/40204
Signed-off-by: Matt Benjamin <mbenjamin@redhat.com>
(cherry picked from commit b4c7d0faeff667c25ab255786999ef0cc844ea2b)

6 years ago Merge pull request #28029 from ifed01/wip-ifed-dump-before-nospanid-mimic
Nathan Cutler [Tue, 2 Jul 2019 08:24:58 +0000 (10:24 +0200)]
Merge pull request #28029 from ifed01/wip-ifed-dump-before-nospanid-mimic

mimic: os/bluestore: dump before "no-spanning blob id" abort

Reviewed-by: Neha Ojha <nojha@redhat.com>
6 years ago Merge pull request #28619 from xiexingguo/wip-40230
Yuri Weinstein [Mon, 1 Jul 2019 20:13:31 +0000 (13:13 -0700)]
Merge pull request #28619 from xiexingguo/wip-40230

mimic: mon, osd: parallel clean_pg_upmaps

Reviewed-by: Neha Ojha <nojha@redhat.com>
6 years ago Merge pull request #28138 from dillaman/wip-39186-mimic
Yuri Weinstein [Mon, 1 Jul 2019 20:06:17 +0000 (13:06 -0700)]
Merge pull request #28138 from dillaman/wip-39186-mimic

mimic: rbd: filter out group/trash snapshots from snap_list

Reviewed-by: Mykola Golub <mgolub@mirantis.com>
6 years ago Merge pull request #28139 from dillaman/wip-38563-mimic
Yuri Weinstein [Mon, 1 Jul 2019 20:05:53 +0000 (13:05 -0700)]
Merge pull request #28139 from dillaman/wip-38563-mimic

mimic: librbd: race condition possible when validating RBD pool

Reviewed-by: Mykola Golub <mgolub@mirantis.com>
6 years ago Merge pull request #28150 from trociny/wip-37691-mimic
Yuri Weinstein [Mon, 1 Jul 2019 20:05:20 +0000 (13:05 -0700)]
Merge pull request #28150 from trociny/wip-37691-mimic

mimic: librbd: disable image mirroring when moving to trash

Reviewed-by: Jason Dillaman <dillaman@redhat.com>
6 years ago Merge pull request #28151 from dillaman/wip-38509-mimic
Yuri Weinstein [Mon, 1 Jul 2019 20:04:49 +0000 (13:04 -0700)]
Merge pull request #28151 from dillaman/wip-38509-mimic

mimic: librbd: add missing shutdown states to managed lock helper

Reviewed-by: Mykola Golub <mgolub@mirantis.com>
6 years ago Merge pull request #28202 from dzafman/wip-39698
Yuri Weinstein [Mon, 1 Jul 2019 20:02:40 +0000 (13:02 -0700)]
Merge pull request #28202 from dzafman/wip-39698

mimic: osd: Don't include user changeable flag in snaptrim related assert

Reviewed-by: Neha Ojha <nojha@redhat.com>
Reviewed-by: Samuel Just <sjust@redhat.com>
6 years ago Merge pull request #28232 from pdvian/wip-39518-mimic
Yuri Weinstein [Mon, 1 Jul 2019 20:01:19 +0000 (13:01 -0700)]
Merge pull request #28232 from pdvian/wip-39518-mimic

mimic: osd: Don't evict after a flush if intersecting scrub range

Reviewed-by: David Zafman <dzafman@redhat.com>
6 years ago Merge pull request #28259 from pdvian/wip-39538-mimic
Yuri Weinstein [Mon, 1 Jul 2019 20:00:46 +0000 (13:00 -0700)]
Merge pull request #28259 from pdvian/wip-39538-mimic

mimic: osd/PG: fix last_complete re-calculation on splitting

Reviewed-by: Josh Durgin <jdurgin@redhat.com>
6 years ago Merge pull request #28503 from pdvian/wip-39737-mimic
Yuri Weinstein [Mon, 1 Jul 2019 19:57:35 +0000 (12:57 -0700)]
Merge pull request #28503 from pdvian/wip-39737-mimic

mimic: osd: Output Base64 encoding of CRC header if binary data present

Reviewed-by: David Zafman <dzafman@redhat.com>
Reviewed-by: Kefu Chai <kchai@redhat.com>
6 years ago Merge pull request #28540 from pdvian/wip-39744-mimic
Yuri Weinstein [Mon, 1 Jul 2019 19:57:01 +0000 (12:57 -0700)]
Merge pull request #28540 from pdvian/wip-39744-mimic

mimic: mon: paxos: introduce new reset_pending_committing_finishers for safety

Reviewed-by: Greg Farnum <gfarnum@redhat.com>
6 years ago Merge pull request #28574 from smithfarm/wip-40280-mimic
Yuri Weinstein [Mon, 1 Jul 2019 19:56:10 +0000 (12:56 -0700)]
Merge pull request #28574 from smithfarm/wip-40280-mimic

mimic: bluestore: 50-100% iops lost due to bluefs_preextend_wal_files = false

Reviewed-by: Igor Fedotov <ifedotov@suse.com>
Reviewed-by: Sage Weil <sage@redhat.com>
6 years ago test: add parallel clean_pg_upmaps test 28619/head
xie xingguo [Mon, 3 Jun 2019 08:43:25 +0000 (16:43 +0800)]
test: add parallel clean_pg_upmaps test

With the parallel clean_pg_upmaps feature on, the performance test
can use up to 8 threads for parallel upmap validation, and its total
time drops from:

maybe_remove_pg_upmaps (~10000 pg_upmap_items) latency:104s

to:

clean_pg_upmaps (~10000 pg_upmap_items) latency:7s

Note that by default the mon uses only 4 worker threads for
CPU-intensive background work. If clean_pg_upmaps still takes
too long, you can increase the "mon_cpu_threads" option value
further.
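
The speed-up comes from checking independent upmap entries on several
threads at once. The following is a rough sketch of that pattern only,
not the OSDMonitor code: `UpmapEntry`, its `valid` flag (standing in for
the expensive CRUSH-based check), and `find_stale_upmaps` are all
hypothetical names.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <thread>
#include <vector>

// Hypothetical stand-in for one pg_upmap_items entry; not the Ceph type.
struct UpmapEntry {
  int pg;
  bool valid;  // pretend result of an expensive validity check
};

// Split `entries` into contiguous shards, check each shard on its own
// thread, and return the pgs whose entries should be cancelled.
std::vector<int> find_stale_upmaps(const std::vector<UpmapEntry>& entries,
                                   unsigned num_threads) {
  num_threads = std::max(1u, num_threads);
  const std::size_t chunk = (entries.size() + num_threads - 1) / num_threads;
  // One result vector per thread, so workers never write to shared state.
  std::vector<std::vector<int>> per_thread(num_threads);
  std::vector<std::thread> workers;
  for (unsigned t = 0; t < num_threads; ++t) {
    workers.emplace_back([&, t] {
      const std::size_t begin = std::min(entries.size(), t * chunk);
      const std::size_t end = std::min(entries.size(), begin + chunk);
      for (std::size_t i = begin; i < end; ++i) {
        if (!entries[i].valid) per_thread[t].push_back(entries[i].pg);
      }
    });
  }
  for (auto& w : workers) w.join();
  // Merge the per-thread results on the calling thread.
  std::vector<int> stale;
  for (const auto& v : per_thread) {
    stale.insert(stale.end(), v.begin(), v.end());
  }
  return stale;
}
```

In this sketch the thread count is a plain argument; in the mon it would
presumably be bounded by the worker pool that "mon_cpu_threads" sizes.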

Fixes: http://tracker.ceph.com/issues/40104
Signed-off-by: xie xingguo <xie.xingguo@zte.com.cn>
(cherry picked from commit a45112a9761d7b11f62cf4dac9f3a8e093cdd78f)

6 years ago mon/OSDMonitor: do clean_pg_upmaps the parallel way if necessary
xie xingguo [Mon, 3 Jun 2019 08:10:22 +0000 (16:10 +0800)]
mon/OSDMonitor: do clean_pg_upmaps the parallel way if necessary

There may well be cases where this kind of checking could reliably
be skipped, but there is no easy way to separate those out.
Running in parallel, however, is clearly the general way to do the
massive pg upmap clean-up job more efficiently, and hence should
make sense in all cases.

Signed-off-by: xie xingguo <xie.xingguo@zte.com.cn>
(cherry picked from commit c395f45f1f4d6f5e2b538a34730d9c92d8f9ae8b)

6 years ago osd/OSDMap: split clean_pg_upmaps into smaller helpers
xie xingguo [Sat, 1 Jun 2019 11:46:25 +0000 (19:46 +0800)]
osd/OSDMap: split clean_pg_upmaps into smaller helpers

- smaller helpers are easier to read.
- the part that updates pending_inc should be made independent,
  since it would be racy when run in parallel.
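
The split described above can be sketched as two phases: a read-only
check that is safe to call concurrently, and a separate step that
mutates the pending change from a single thread. The types and function
names below are hypothetical and far simpler than the real OSDMap
helpers.

```cpp
#include <cassert>
#include <set>
#include <utility>
#include <vector>

using pg_id = int;
using osd_id = int;
using upmap_items = std::vector<std::pair<osd_id, osd_id>>;  // (from, to)

// Hypothetical stand-in for the pending incremental map.
struct PendingChange {
  std::set<pg_id> cancelled;
};

// Phase 1: pure check. It only reads its inputs, so many pgs can be
// checked concurrently. Here an entry is stale if it points at an osd
// that no longer exists (an assumed, simplified rule).
bool upmap_is_stale(const upmap_items& items,
                    const std::set<osd_id>& existing_osds) {
  for (const auto& item : items) {
    if (existing_osds.count(item.second) == 0) return true;
  }
  return false;
}

// Phase 2: mutation. Kept separate so that exactly one thread applies
// the collected results to the pending change.
void apply_cancellations(const std::vector<pg_id>& stale,
                         PendingChange& pending) {
  for (pg_id pg : stale) pending.cancelled.insert(pg);
}
```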

Signed-off-by: xie xingguo <xie.xingguo@zte.com.cn>
(cherry picked from commit 4d5cf1e4173e5151cc571af571edb2eab0bb46a7)

Conflicts:
slight conflicts with pg-merge

6 years ago osd: maybe_remove_pg_upmaps -> clean_pg_upmaps
xie xingguo [Mon, 17 Jun 2019 10:44:09 +0000 (18:44 +0800)]
osd: maybe_remove_pg_upmaps -> clean_pg_upmaps

Removing unnecessary or duplicated code should always be the
preferred option, since it eases maintenance.
Also, there is already a clean_temps helper, so renaming
maybe_remove_pg_upmaps to clean_pg_upmaps to match it seems
a natural choice.

Master PR: https://github.com/ceph/ceph/pull/28373

This does not follow the normal backport process because
this piece of code has changed a lot over the past 18 months.

Signed-off-by: xie xingguo <xie.xingguo@zte.com.cn>
6 years ago osd/OSDMapMapping: make ParallelPGMapper can accept input pgs
xie xingguo [Wed, 5 Jun 2019 02:41:52 +0000 (10:41 +0800)]
osd/OSDMapMapping: make ParallelPGMapper can accept input pgs

The existing "prime temp" mechanism is a good model for clusters
with a large number of pgs that need to do various calculations
quickly.
I am planning to do the upmap tidy-up work the same way, hence
the need for an alternate way of specifying which pgs to process,
rather than taking them directly from the map.

Signed-off-by: xie xingguo <xie.xingguo@zte.com.cn>
(cherry picked from commit f6fd4a312e0dda260f2c150334f06b531678ce47)

Conflicts:
s/g_conf./g_conf->/

6 years ago osd/OSDMap: maybe_remove_pg_upmaps - avoid do_crush twice
xie xingguo [Sat, 1 Jun 2019 06:22:39 +0000 (14:22 +0800)]
osd/OSDMap: maybe_remove_pg_upmaps - avoid do_crush twice

Calling do_crush twice is extremely time-consuming.
Applying this patch roughly halves the time spent in
maybe_remove_pg_upmaps:

Was: maybe_remove_pg_upmaps (~10000 pg_upmap_items) latency:104s
Now: maybe_remove_pg_upmaps (~10000 pg_upmap_items) latency:56s

Signed-off-by: xie xingguo <xie.xingguo@zte.com.cn>
(cherry picked from commit 02e5499b350bcd7d9eac98b2072052a9a4a1f535)

Conflicts:
- s/nextmap/tmpmap/
- drop std:: namespace

6 years ago osd/OSDMap: maybe_remove_pg_upmaps - s/pg_to_raw_up/pg_to_raw_upmap/
xie xingguo [Sat, 1 Jun 2019 04:13:19 +0000 (12:13 +0800)]
osd/OSDMap: maybe_remove_pg_upmaps - s/pg_to_raw_up/pg_to_raw_upmap/

The upmap results are applied directly after calling
_pg_to_raw_osds, which means they have nothing to do
with the up/down status.
In other words, if a pg_upmap/pg_upmap_items entry remapped a pg
onto some down osds and now causes a colliding result,
we should still be able to detect and cancel it.

Signed-off-by: xie xingguo <xie.xingguo@zte.com.cn>
(cherry picked from commit d9ed406a61c46858dd8350af5c72d7b8824dcdd3)

Conflicts:
s/nextmap/tmpmap/

6 years ago test/osd: add performance test case for maybe_remove_pg_upmap
xie xingguo [Sat, 1 Jun 2019 02:43:10 +0000 (10:43 +0800)]
test/osd: add performance test case for maybe_remove_pg_upmap

Tom Byrne reported that maybe_remove_pg_upmap can become
very inefficient on large clusters with the balancer on.
To identify and resolve the problem, we first need some good
measurements.

Signed-off-by: xie xingguo <xie.xingguo@zte.com.cn>
(cherry picked from commit c0ce22b8c861fb76957b4cbbd59d9800e1ec09c3)

6 years ago osd/bluestore: Actually wait until completion in write_sync 28574/head
Vitaliy Filippov [Wed, 6 Mar 2019 23:01:18 +0000 (02:01 +0300)]
osd/bluestore: Actually wait until completion in write_sync

This function is only used by RocksDB WAL writing so it must sync data.

This fixes #18338 and thus makes it possible to actually set `bluefs_preextend_wal_files`
to true, gaining +100% single-thread write iops in disk-bound (HDD or bad SSD) setups.
To my knowledge it doesn't hurt performance in other cases.
Test it yourself on any HDD with `fio -ioengine=rbd -direct=1 -bs=4k -iodepth=1`.

Issue #18338 is easily reproduced without this patch by issuing a `kill -9` to the OSD
while doing `fio -ioengine=rbd -direct=1 -bs=4M -iodepth=16`.

Fixes: https://tracker.ceph.com/issues/18338 https://tracker.ceph.com/issues/38559
Signed-off-by: Vitaliy Filippov <vitalif@yourcmc.ru>
(cherry picked from commit c703cf9a7632cbd9f17e148ef203509549a28571)

Conflicts:
src/os/bluestore/KernelDevice.cc
- mimic has a single variable "fd_buffered" where master has an array "fd_buffereds"

6 years ago mon: paxos: introduce new reset_pending_committing_finishers for safety 28540/head
Greg Farnum [Mon, 29 Apr 2019 22:39:59 +0000 (15:39 -0700)]
mon: paxos: introduce new reset_pending_committing_finishers for safety

There are asserts about the state of the system and pending_finishers which can
be triggered by running arbitrary commands through again. They are correct
when not restarting, but when we do restart we need to take care to preserve
the same invariants as appropriate. Use this function to be careful about
the order of committing_finishers v pending_finishers and to make sure they're
both empty before any Contexts actually get called.

We also reorder a call to finish_contexts on the waiting_for_writeable list for
similar reasons.

Fixes: http://tracker.ceph.com/issues/39484
Signed-off-by: Greg Farnum <gfarnum@redhat.com>
(cherry picked from commit b17caec586ba2801593db61f91d66719d40b905e)

6 years ago osd: Output Base64 encoding of CRC header if binary data present 28503/head
David Zafman [Sat, 4 May 2019 18:32:40 +0000 (11:32 -0700)]
osd: Output Base64 encoding of CRC header if binary data present

Add optional parameter so cleanbin() for bufferlist can include "Base64:"

Fixes: https://tracker.ceph.com/issues/39582
Signed-off-by: David Zafman <dzafman@redhat.com>
(cherry picked from commit eea239c03ca8aeb9a1da742b07d8627fbe2317e2)

Conflicts:
src/include/util.h : Resolved for cleanbin
src/osd/ReplicatedBackend.cc : Resolved for util.h

6 years ago Merge pull request #28457 from yuriw/wip-yuriw-40208-mimic
Yuri Weinstein [Mon, 10 Jun 2019 19:32:05 +0000 (12:32 -0700)]
Merge pull request #28457 from yuriw/wip-yuriw-40208-mimic

qa/tests: removed `1node` and `systemd` tests as ceph-deploy is not actively developed

Reviewed-by: Neha Ojha <nojha@redhat.com>
6 years ago qa/tests: removed `1node` and `systemd` tests as ceph-deploy is not actively developed 28457/head
Yuri Weinstein [Sat, 8 Jun 2019 16:17:53 +0000 (09:17 -0700)]
qa/tests: removed `1node` and `systemd` tests as ceph-deploy is not actively developed

Signed-off-by: Yuri Weinstein <yweinste@redhat.com>
6 years ago Merge pull request #28389 from yuriw/wip-yuriw-whitelist-mimic
Yuri Weinstein [Tue, 4 Jun 2019 20:10:44 +0000 (13:10 -0700)]
Merge pull request #28389 from yuriw/wip-yuriw-whitelist-mimic

qa/tests: whitelisted  'application not enabled'

6 years ago qa/tests: whitelisted 'application not enabled' 28389/head
Yuri Weinstein [Tue, 4 Jun 2019 19:53:40 +0000 (12:53 -0700)]
qa/tests: whitelisted  'application not enabled'

Signed-off-by: Yuri Weinstein <yweinste@redhat.com>
6 years ago 13.2.6 v13.2.6
Jenkins Build Slave User [Mon, 3 Jun 2019 15:18:57 +0000 (15:18 +0000)]
13.2.6

6 years ago Merge pull request #28356 from yuriw/wip-yuriw-whitelist-mimic
Yuri Weinstein [Sun, 2 Jun 2019 20:33:06 +0000 (13:33 -0700)]
Merge pull request #28356 from yuriw/wip-yuriw-whitelist-mimic

qa/tests: whitelisted POOL_APP_NOT_ENABLED

6 years ago qa/tests: whitelisted POOL_APP_NOT_ENABLED 28356/head
Yuri Weinstein [Fri, 31 May 2019 21:04:19 +0000 (14:04 -0700)]
qa/tests: whitelisted POOL_APP_NOT_ENABLED

Signed-off-by: Yuri Weinstein <yweinste@redhat.com>
6 years ago Merge pull request #28301 from yuriw/wip-yuriw-exl-python-mimic
Yuri Weinstein [Fri, 31 May 2019 14:55:40 +0000 (07:55 -0700)]
Merge pull request #28301 from yuriw/wip-yuriw-exl-python-mimic

qa/tests: added 'python3-cephfs','python3-rados' to excluded packadges

6 years ago qa/tests: added 'python3-cephfs','python3-rados' to excluded packadges 28301/head
Yuri Weinstein [Wed, 29 May 2019 19:19:39 +0000 (12:19 -0700)]
qa/tests: added 'python3-cephfs','python3-rados' to excluded packadges

Signed-off-by: Yuri Weinstein <yweinste@redhat.com>
6 years ago Merge pull request #28129 from dillaman/wip-39728-mimic
Yuri Weinstein [Thu, 30 May 2019 20:40:15 +0000 (13:40 -0700)]
Merge pull request #28129 from dillaman/wip-39728-mimic

mimic: qa/workunits/rbd: use https protocol for devstack git operations

Reviewed-by: Mykola Golub <mgolub@mirantis.com>
6 years ago Merge pull request #28184 from yuriw/wip-yuriw-fix-distro-mimic
Yuri Weinstein [Tue, 28 May 2019 21:42:01 +0000 (14:42 -0700)]
Merge pull request #28184 from yuriw/wip-yuriw-fix-distro-mimic

qa/tests: cleaned up supported distro

6 years ago osd/PG: fix last_complete re-calculation on splitting 28259/head
xie xingguo [Sat, 20 Apr 2019 08:34:12 +0000 (16:34 +0800)]
osd/PG: fix last_complete re-calculation on splitting

We now enforce a hard limit on pg_logs, which means we might keep trimming
old log entries irrespective of the pg's current missing_set. As a
result, on splitting, the last_complete pointer can move far ahead of the
real on-disk version (the oldest need of missing_set, for instance) that
the corresponding pg should have:

```
2019-04-19 06:41:52.559247 7efd4725c700 10 osd.2 271 Splitting pg[5.6( v 270'943 lc 0'0 (238'300,270'943] local-lis/les=250/251 n=943 ec=223/223 lis/c 250/223 les/251/224/0 250/271/229) [5,2] r=1 lpr=271 pi=[223,271)/4 crt=270'943 unknown NOTIFY m=518 mbc={}] into 5.16
2019-04-19 06:41:52.561413 7efd4725c700 10 osd.2 pg_epoch: 271 pg[5.6( v 270'943 lc 238'300 (238'300,270'943] local-lis/les=250/251 n=943 ec=223/223 lis/c 250/223 c/f 251/224/0 250/271/229) [5,2] r=1 lpr=271 pi=[223,271)/4 crt=270'943 lcod 0'0 unknown NOTIFY m=261 mbc={}] release_backoffs [MIN,MAX)
```

In the above example, the parent's last_complete cursor changed directly
from **0'0** to **238'300**, because splitting tried to catch up with the
changed oldest log entry. However, back in v12.2.9 the primary
would still reference a shard's last_complete field when trying to figure out all
possible locations of a currently missing object (see PG::MissingLoc::add_source_info):

```c++
  if (oinfo.last_complete < need) {
    if (omissing.is_missing(soid)) {
      ldout(pg->cct, 10) << "search_for_missing " << soid << " " << need
                         << " also missing on osd." << fromosd << dendl;
      continue;
    }
  }
```

Hence a wrongly calculated last_complete could make the primary wrongly
conclude that a specific shard might have the authoritative copy of the
object it is currently looking for:

```
2019-04-19 06:41:52.904163 7fd4cfb5a700 10 osd.5 pg_epoch: 271 pg[5.6( v 270'943 lc 238'300 (238'300,270'943] local-lis/les=250/251 n=471 ec=223/223 lis/c 250/223 les/c/f 251/224/0 250/271/229) [5,2] r=0 lpr=271 pi=[223,271)/4 crt=270'943 lcod 226'77 mlcod 0'0 peering m=16 mbc={}] proc_replica_log for osd.2: 5.6( v 270'943 lc 238'300 (238'300,270'943] local-lis/les=250/251 n=471 ec=223/223 lis/c 250/223 les/c/f 251/224/0 250/271/229) log((249'563,270'943], crt=270'943) missing(261 may_include_deletes = 1)
2019-04-19 06:41:52.904645 7fd4cfb5a700 20 osd.5 pg_epoch: 271 pg[5.6( v 270'943 lc 238'300 (238'300,270'943] local-lis/les=250/251 n=471 ec=223/223 lis/c 250/223 les/c/f 251/224/0 250/271/229) [5,2] r=0 lpr=271 pi=[223,271)/4 crt=270'943 lcod 226'77 mlcod 0'0 peering m=16 mbc={}]  after missing 5:624c3a7a:::benchmark_data_smithi190_39968_object1382:head need 226'110 have 0'0
2019-04-19 06:41:53.567820 7fd4d035b700 10 osd.5 pg_epoch: 272 pg[5.6( v 270'943 lc 0'0 (238'300,270'943] local-lis/les=271/272 n=471 ec=223/223 lis/c 250/223 les/c/f 251/224/0 250/271/229) [5,2] r=0 lpr=271 pi=[223,271)/4 crt=270'943 lcod 226'77 mlcod 0'0 unknown m=16 u=13 mbc={255={(1+0)=220,(2+0)=28}}] search_for_missing 5:624c3a7a:::benchmark_data_smithi190_39968_object1382:head 226'110 is on osd.2
```

Note that ```5:624c3a7a:::benchmark_data_smithi190_39968_object1382:head 226'110```
was actually missing on both the primary and shard osd.2, whereas the primary
insisted that the object should exist on shard osd.2!
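
To make the failure mode concrete, here is a simplified model of the
check quoted earlier, using the versions from the logs. It is only a
sketch: `Version` and `considered_source` are hypothetical names, and
the real add_source_info logic does more than this.

```cpp
#include <cassert>
#include <utility>

// Hypothetical stand-in for an (epoch, version) pair such as 226'110.
using Version = std::pair<unsigned, unsigned>;

// Mirrors the shape of the quoted check: the shard's missing set is
// only consulted when its last_complete is older than the needed version.
bool considered_source(const Version& last_complete, const Version& need,
                       bool in_missing_set) {
  if (last_complete < need && in_missing_set) {
    return false;  // "also missing on osd.N": skip this shard
  }
  return true;  // shard is treated as a possible location
}
```

With last_complete at 0'0 and need 226'110, a shard that lists the
object as missing is skipped. With last_complete wrongly advanced to
238'300, the missing set is never consulted and the same shard is
reported as a source.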

https://github.com/ceph/ceph/pull/26175 posted an indirect fix
for the above problem by ignoring last_complete when checking the missing set,
but it generally makes more sense to fill in the last_complete field correctly
whenever possible.
Hence this additional fix.

Fixes: http://tracker.ceph.com/issues/26958
Signed-off-by: xie xingguo <xie.xingguo@zte.com.cn>
(cherry picked from commit aad5d47be64ed7feba79f540ec1debc45625a74f)

6 years ago osd: Don't evict after a flush if intersecting scrub range 28232/head
David Zafman [Tue, 26 Mar 2019 22:53:10 +0000 (15:53 -0700)]
osd: Don't evict after a flush if intersecting scrub range

Fixes: http://tracker.ceph.com/issues/38840
Signed-off-by: David Zafman <dzafman@redhat.com>
(cherry picked from commit 459cbb5a6ca3b521b20f36f328f25f398d0ef1c4)

6 years ago osd: Don't include user changeable flag in snaptrim related assert 28202/head
David Zafman [Tue, 30 Apr 2019 03:20:18 +0000 (20:20 -0700)]
osd: Don't include user changeable flag in snaptrim related assert

Caused by: a53ba7314c53e75d1e0b8a0edd29181db3c93863

Fixes: http://tracker.ceph.com/issues/38124
Signed-off-by: David Zafman <dzafman@redhat.com>
(cherry picked from commit 63c2060c2d9eef49dacf2ba240ccfbead9791f43)

Conflicts:
src/osd/PrimaryLogPG.h (Use assert() in mimic)

6 years ago qa/tests: cleaned up supported distro 28184/head
Yuri Weinstein [Mon, 20 May 2019 23:36:36 +0000 (16:36 -0700)]
qa/tests: cleaned up supported distro

Signed-off-by: Yuri Weinstein <yweinste@redhat.com>
6 years ago rbd-mirror: image replay should retry asok registration upon failure 28151/head
Jason Dillaman [Sat, 23 Feb 2019 00:25:44 +0000 (19:25 -0500)]
rbd-mirror: image replay should retry asok registration upon failure

If the asok registration fails (perhaps due to a race condition with
a deleted and recreated image of the same name), periodically attempt
to register the missing asok hook.

Signed-off-by: Jason Dillaman <dillaman@redhat.com>
(cherry picked from commit 2d2d3bbc791e807bb0c83072aaeee023116884ce)

6 years ago rbd-mirror: failure to initialize pool replayer should stop leader
Jason Dillaman [Fri, 22 Feb 2019 18:52:43 +0000 (13:52 -0500)]
rbd-mirror: failure to initialize pool replayer should stop leader

The leader watcher should not start processing requests if it failed
to initialize.

Signed-off-by: Jason Dillaman <dillaman@redhat.com>
(cherry picked from commit 3723d44fa83f38572bf7587e1111005c8ba6ed69)

6 years ago rbd-mirror: fixed potential bootstrap cancel race condition
Jason Dillaman [Fri, 22 Feb 2019 18:22:55 +0000 (13:22 -0500)]
rbd-mirror: fixed potential bootstrap cancel race condition

If the image replay was canceled prior to the start of the bootstrap
stage, the image replayer would be stuck attempting to shut down if
the bootstrap is paused behind an image sync.

Signed-off-by: Jason Dillaman <dillaman@redhat.com>
(cherry picked from commit 91b8a24ea58aee01ec8567195b5aabe11f74b87f)

Conflicts:
src/tools/rbd_mirror/ImageReplayer.cc: trivial resolution

6 years ago rbd-mirror: complete pool watcher initialization if object missing
Jason Dillaman [Fri, 22 Feb 2019 15:59:26 +0000 (10:59 -0500)]
rbd-mirror: complete pool watcher initialization if object missing

If the mirroring object is missing, complete the initialization and
continue to retry in the background. This is useful for cases where
the remote doesn't (yet) have mirroring enabled but the remote
pool watcher initialization is delaying the leader watcher promotion
to the point where the leader is blacklisted by its peers.

Signed-off-by: Jason Dillaman <dillaman@redhat.com>
(cherry picked from commit 80954cd914c86e11ffb6a8cbfcb21202cb8131b5)

6 years ago rbd-mirror: abort trash watcher initialization if create blacklisted
Jason Dillaman [Thu, 21 Feb 2019 17:37:38 +0000 (12:37 -0500)]
rbd-mirror: abort trash watcher initialization if create blacklisted

Signed-off-by: Jason Dillaman <dillaman@redhat.com>
(cherry picked from commit 8b883d136be4454029af7c4c7e580936ad901156)

6 years ago rbd-mirror: avoid attempting to reacquire lock if blacklisted
Jason Dillaman [Thu, 21 Feb 2019 17:37:00 +0000 (12:37 -0500)]
rbd-mirror: avoid attempting to reacquire lock if blacklisted

The pool replayer state machine will automatically be restarted
when a blacklist is detected.

Signed-off-by: Jason Dillaman <dillaman@redhat.com>
(cherry picked from commit ff37c5c348ee811675007fc90c6b09f094ebf372)

6 years ago librbd: add missing shutdown states to managed lock helper
Jason Dillaman [Tue, 19 Feb 2019 21:06:48 +0000 (16:06 -0500)]
librbd: add missing shutdown states to managed lock helper

The PRE_SHUTTING_DOWN and SHUTTING_DOWN states were missed
in the 'is_state_shutdown' helper method. This resulted in
rbd-mirror potentially entering an infinite loop during
shutdown.
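
As an illustration only, a helper of this kind might look like the
sketch below. The enum is a reduced, hypothetical state set (the real
managed lock has more states), and `should_retry` is an invented caller
showing why a missed state can keep a loop running.

```cpp
#include <cassert>

// Reduced, hypothetical state set for illustration.
enum class LockState {
  UNLOCKED,
  LOCKED,
  PRE_SHUTTING_DOWN,
  SHUTTING_DOWN,
  SHUTDOWN
};

// Every state on the shutdown path must count as "shutdown"; leaving
// out the two transitional states is the kind of gap described above.
bool is_state_shutdown(LockState s) {
  switch (s) {
    case LockState::PRE_SHUTTING_DOWN:
    case LockState::SHUTTING_DOWN:
    case LockState::SHUTDOWN:
      return true;
    default:
      return false;
  }
}

// Hypothetical caller: keep retrying only while no shutdown is in progress.
bool should_retry(LockState s) { return !is_state_shutdown(s); }
```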

http://tracker.ceph.com/issues/38387
Signed-off-by: Jason Dillaman <dillaman@redhat.com>
(cherry picked from commit 82af5710ad49dd6e24c2736a9865e1a41add89a2)

6 years ago qa/workunits/rbd: delete pools before stopping rbd-mirror
Jason Dillaman [Tue, 19 Feb 2019 20:48:38 +0000 (15:48 -0500)]
qa/workunits/rbd: delete pools before stopping rbd-mirror

This better mimics the behavior of teuthology and tests rbd-mirror
daemon's ability to handle a pool deletion.

Signed-off-by: Jason Dillaman <dillaman@redhat.com>
(cherry picked from commit 9d694ba3515f6077d2f66b272d3df1ab082b7bea)

6 years ago qa/workunits/rbd: add trash move/restore mirror test 28150/head
Mykola Golub [Fri, 14 Dec 2018 16:47:00 +0000 (16:47 +0000)]
qa/workunits/rbd: add trash move/restore mirror test

Signed-off-by: Mykola Golub <mgolub@suse.com>
(cherry picked from commit b29b4f0c712cb2a2e97ce3de0b15d9075927b172)

6 years ago librbd: disable image mirroring when moving to trash
Mykola Golub [Wed, 12 Dec 2018 15:42:49 +0000 (15:42 +0000)]
librbd: disable image mirroring when moving to trash

And enable it again on restore if mirror pool mode is set.

Fixes: https://tracker.ceph.com/issues/37596
Signed-off-by: Mykola Golub <mgolub@suse.com>
(cherry picked from commit 6844a52ebbc45a046b4a94909b48bc4fc0c305c7)

Conflicts:
src/librbd/api/Trash.cc (the code is in src/librbd/internal.cc)

6 years ago librbd: create state machine uses new validate pool state machine 28139/head
Jason Dillaman [Wed, 27 Feb 2019 19:08:04 +0000 (14:08 -0500)]
librbd: create state machine uses new validate pool state machine

Fixes: http://tracker.ceph.com/issues/38500
Signed-off-by: Jason Dillaman <dillaman@redhat.com>
(cherry picked from commit 6a84ca3f24421a5150ae32e7ff380f4121d32b13)

Conflicts:
src/librbd/image/CreateRequest.cc: trivial resolution

6 years ago librbd: separate pool validation into a standalone state machine
Jason Dillaman [Wed, 27 Feb 2019 18:50:29 +0000 (13:50 -0500)]
librbd: separate pool validation into a standalone state machine

Signed-off-by: Jason Dillaman <dillaman@redhat.com>
(cherry picked from commit b3ee83b9ce9698a933c784dbb4a9e19a81847a50)

Conflicts:
        src/librbd/image/ValidatePoolRequest.cc: removed namespace support

6 years ago rbd: fix error processing images which have non-user snapshots 28138/head
songweibin [Sat, 18 Aug 2018 01:05:03 +0000 (09:05 +0800)]
rbd: fix error processing images which have non-user snapshots

Signed-off-by: songweibin <song.weibin@zte.com.cn>
(cherry picked from commit 91c67b2a8d12ac03173a8e060dec1278f26b338a)

6 years ago rbd: fix error purging non-user snapshots
songweibin [Sat, 18 Aug 2018 00:50:24 +0000 (08:50 +0800)]
rbd: fix error purging non-user snapshots

Fixes:
  [root@ ~]# rbd snap rm img1@snap1
  [root@ ~]# rbd snap ls img1 -a
  SNAPID NAME                                 SIZE    TIMESTAMP                NAMESPACE
       4 f2e82bd1-e2ff-4a6b-aaef-5a12a2b23a30 100 MiB Sat Aug 18 08:48:34 2018 trash (snap1)
  [root@ ~]# rbd snap purge img1
  Removing all snapshots: 0% complete...failed.
  rbd: removing snaps failed: (2) No such file or directory

Signed-off-by: songweibin <song.weibin@zte.com.cn>
(cherry picked from commit 2c79a4939090d445a8172dbbe4d4072a4851ddcf)

6 years ago librbd: improve object map performance under high IOPS workloads 28136/head
Jason Dillaman [Thu, 28 Feb 2019 21:43:27 +0000 (16:43 -0500)]
librbd: improve object map performance under high IOPS workloads

Do not zero-fill the BitVector's bitset prior to decoding the data.
Additionally, only read-update-modify the portions of the footer
that are potentially affected by the updated state.

Fixes: http://tracker.ceph.com/issues/38538
Signed-off-by: Jason Dillaman <dillaman@redhat.com>
(cherry picked from commit 071671fff64f27943047610fe075a7e98f0f705c)

Conflicts:
src/cls/rbd/cls_rbd.cc
src/cls/rbd/cls_rbd_client.h
src/common/bit_vector.hpp
src/test/common/test_bit_vector.cc
src/test/librbd/test_ObjectMap.cc
Trivial conflicts with bufferlist::begin/cbegin and assert/ceph_assert

6 years ago qa/workunits/rbd: use https protocol for devstack git operations 28129/head
Jason Dillaman [Thu, 9 May 2019 19:48:30 +0000 (15:48 -0400)]
qa/workunits/rbd: use https protocol for devstack git operations

Fixes: http://tracker.ceph.com/issues/39656
Signed-off-by: Jason Dillaman <dillaman@redhat.com>
(cherry picked from commit fb4f9a8a08edd18b1a23e1be4c1285d0ec4d1de6)

6 years ago Merge pull request #28096 from ivancich/mimic-wip-rgw-admin-unordered
J. Eric Ivancich [Wed, 15 May 2019 16:24:47 +0000 (12:24 -0400)]
Merge pull request #28096 from ivancich/mimic-wip-rgw-admin-unordered

mimic: rgw: allow radosgw-admin to list bucket w --allow-unordered

Reviewed-by: Adam C. Emerson <aemerson@redhat.com>
6 years ago Merge pull request #26762 from pdvian/wip-38530-mimic
Yuri Weinstein [Wed, 15 May 2019 12:03:49 +0000 (05:03 -0700)]
Merge pull request #26762 from pdvian/wip-38530-mimic

mimic: rgw: data sync drains lease stack on lease failure

Reviewed-by: Casey Bodley <cbodley@redhat.com>
6 years ago Merge pull request #27973 from iain-buclaw-sociomantic/mimic-rgw-cls-bi-list-log...
Yuri Weinstein [Wed, 15 May 2019 12:03:07 +0000 (05:03 -0700)]
Merge pull request #27973 from iain-buclaw-sociomantic/mimic-rgw-cls-bi-list-log-level

mimic: cls/rgw: raise debug level of bi_log_iterate_entries output

Reviewed-by: Casey Bodley <cbodley@redhat.com>
Reviewed-by: Nathan Cutler <ncutler@suse.com>
6 years ago Merge pull request #28086 from cbodley/wip-39411
Yuri Weinstein [Wed, 15 May 2019 12:02:19 +0000 (05:02 -0700)]
Merge pull request #28086 from cbodley/wip-39411

mimic: rgw: cls_bucket_list_unordered lists a single shard

Reviewed-by: J. Eric Ivancich <ivancich@redhat.com>
6 years ago rgw: allow radosgw-admin to list bucket w --allow-unordered 28096/head
J. Eric Ivancich [Wed, 8 May 2019 18:47:04 +0000 (14:47 -0400)]
rgw: allow radosgw-admin to list bucket w --allow-unordered

Presently the `radosgw-admin bucket list --bucket=<bucket>` lists the
objects in lexical order. This can be an expensive operation since
objects are not stored in bucket index shards in order and a selection
sort process is done across all bucket index shards.

By allowing the user to add the "--allow-unordered" command-line flag,
a more efficient bucket listing is enabled. This is particularly
important for buckets with a large number of objects.
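
The cost difference can be sketched as follows. This is a toy model,
not the rgw code: it assumes each shard returns its own keys sorted,
and `Shard`, `list_ordered`, and `list_unordered` are hypothetical
names.

```cpp
#include <cassert>
#include <cstddef>
#include <string>
#include <vector>

// Hypothetical: keys held by one bucket index shard, sorted within the shard.
using Shard = std::vector<std::string>;

// Ordered listing: repeatedly select the smallest unread key across
// all shards, so every emitted key costs a pass over the shard heads.
std::vector<std::string> list_ordered(const std::vector<Shard>& shards) {
  std::vector<std::size_t> pos(shards.size(), 0);
  std::vector<std::string> out;
  for (;;) {
    std::size_t best = shards.size();  // sentinel: no shard selected yet
    for (std::size_t s = 0; s < shards.size(); ++s) {
      if (pos[s] == shards[s].size()) continue;  // shard exhausted
      if (best == shards.size() ||
          shards[s][pos[s]] < shards[best][pos[best]]) {
        best = s;
      }
    }
    if (best == shards.size()) break;  // all shards exhausted
    out.push_back(shards[best][pos[best]++]);
  }
  return out;
}

// Unordered listing: drain one shard after another with no
// cross-shard comparison.
std::vector<std::string> list_unordered(const std::vector<Shard>& shards) {
  std::vector<std::string> out;
  for (const auto& shard : shards) {
    out.insert(out.end(), shard.begin(), shard.end());
  }
  return out;
}
```

Both functions return the same set of keys; only the order and the
amount of work per key differ.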

Signed-off-by: J. Eric Ivancich <ivancich@redhat.com>
(cherry picked from commit 12452c4a91ee56c10d3dcf970dd2b0dd5aeb2401)

6 years ago rgw: cls_bucket_list_unordered lists a single shard 28086/head
Casey Bodley [Fri, 19 Apr 2019 22:38:47 +0000 (18:38 -0400)]
rgw: cls_bucket_list_unordered lists a single shard

CLSRGWIssueBucketList sends the request to every shard, but this loop
is intended to list only the current_shard.

Fixes: http://tracker.ceph.com/issues/39393
Signed-off-by: Casey Bodley <cbodley@redhat.com>
(cherry picked from commit d37d0339ff61a293f2f9fd6dff3fbd630efce2a0)

Conflicts:
src/rgw/rgw_rados.cc: remove unnecessary "struct"s

6 years ago cls/rgw: expose cls_rgw_bucket_list_op for single shard
Casey Bodley [Fri, 19 Apr 2019 22:37:35 +0000 (18:37 -0400)]
cls/rgw: expose cls_rgw_bucket_list_op for single shard

Signed-off-by: Casey Bodley <cbodley@redhat.com>
(cherry picked from commit cd1fc96c5ca5254eb8343509b11a59514d62e532)

Conflicts:
src/cls/rgw/cls_rgw_client.cc: remove unnecessary "struct"s

6 years ago cls/rgw: raise debug level of bi_log_iterate_entries output 27973/head
Casey Bodley [Fri, 14 Dec 2018 19:38:31 +0000 (14:38 -0500)]
cls/rgw: raise debug level of bi_log_iterate_entries output

Signed-off-by: Casey Bodley <cbodley@redhat.com>
(cherry picked from commit 9c8a207cef8792fc5a467b27b500153eef455c04)

6 years ago Merge pull request #27432 from smithfarm/wip-38540-mimic
Yuri Weinstein [Sat, 11 May 2019 16:21:00 +0000 (09:21 -0700)]
Merge pull request #27432 from smithfarm/wip-38540-mimic

mimic: qa: fsstress with valgrind may timeout

Reviewed-by: Venky Shankar <vshankar@redhat.com>
6 years ago Merge pull request #27847 from ashishkumsingh/wip-39469-mimic
Yuri Weinstein [Sat, 11 May 2019 16:20:32 +0000 (09:20 -0700)]
Merge pull request #27847 from ashishkumsingh/wip-39469-mimic

mimic : s: better output of 'ceph health detail'

Reviewed-by: Venky Shankar <vshankar@redhat.com>
6 years ago Merge pull request #27906 from smithfarm/wip-38736-mimic
Yuri Weinstein [Sat, 11 May 2019 16:20:03 +0000 (09:20 -0700)]
Merge pull request #27906 from smithfarm/wip-38736-mimic

mimic: qa: [WRN] Health check failed: 1/3 mons down, quorum b,c (MON_DOWN) in cluster log

Reviewed-by: Venky Shankar <vshankar@redhat.com>
Reviewed-by: Neha Ojha <nojha@redhat.com>
6 years ago Merge pull request #27916 from pdvian/wip-39193-mimic
Yuri Weinstein [Sat, 11 May 2019 16:19:26 +0000 (09:19 -0700)]
Merge pull request #27916 from pdvian/wip-39193-mimic

mimic: mds: drop reconnect message from non-existent session

Reviewed-by: Venky Shankar <vshankar@redhat.com>
6 years agoMerge pull request #27917 from pdvian/wip-39200-mimic
Yuri Weinstein [Sat, 11 May 2019 16:18:59 +0000 (09:18 -0700)]
Merge pull request #27917 from pdvian/wip-39200-mimic

mimic: mds/server: check directory split after rename.

Reviewed-by: Venky Shankar <vshankar@redhat.com>
6 years agoMerge pull request #28014 from cbodley/wip-39614
Yuri Weinstein [Fri, 10 May 2019 15:30:06 +0000 (08:30 -0700)]
Merge pull request #28014 from cbodley/wip-39614

mimic: rgw: use chunked encoding to get partial results out faster

Reviewed-by: Yuri Weinstein <yweinste@redhat.com>
6 years agoMerge pull request #27938 from pdvian/wip-39206-mimic
Yuri Weinstein [Thu, 9 May 2019 15:47:53 +0000 (08:47 -0700)]
Merge pull request #27938 from pdvian/wip-39206-mimic

mimic: osd: shutdown recovery_request_timer earlier

Reviewed-by: Neha Ojha <nojha@redhat.com>
6 years agoMerge pull request #27963 from xiexingguo/wip-mimic-upmap-fixes
Yuri Weinstein [Thu, 9 May 2019 15:46:53 +0000 (08:46 -0700)]
Merge pull request #27963 from xiexingguo/wip-mimic-upmap-fixes

mimic: crush: backport recent upmap fixes

Reviewed-by: Neha Ojha <nojha@redhat.com>
Reviewed-by: Jan Fajerski <jfajerski@suse.com>
6 years agoMerge pull request #27940 from pdvian/wip-39220-mimic
Yuri Weinstein [Wed, 8 May 2019 19:41:56 +0000 (12:41 -0700)]
Merge pull request #27940 from pdvian/wip-39220-mimic

mimic: osd: FAILED ceph_assert(attrs || !pg_log.get_missing().is_missing(soid) || (it_objects != pg_log.get_log().objects.end() && it_objects->second->op == pg_log_entry_t::LOST_REVERT)) in PrimaryLogPG::get_object_context()

Reviewed-by: Neha Ojha <nojha@redhat.com>
6 years agoMerge pull request #27055 from ifed01/wip-ifed-fix-storetest-mimic
Yuri Weinstein [Wed, 8 May 2019 19:20:00 +0000 (12:20 -0700)]
Merge pull request #27055 from ifed01/wip-ifed-fix-storetest-mimic

mimic: test/store_test: fix/workaround for BlobReuseOnOverwriteUT and garbageCollection

Reviewed-by: Sage Weil <sage@redhat.com>
6 years agoMerge pull request #27907 from smithfarm/wip-38443-mimic
Yuri Weinstein [Wed, 8 May 2019 19:19:32 +0000 (12:19 -0700)]
Merge pull request #27907 from smithfarm/wip-38443-mimic

mimic: tests: osd-markdown.sh can fail with CLI_DUP_COMMAND=1

Reviewed-by: Sage Weil <sage@redhat.com>
6 years agoMerge pull request #27943 from smithfarm/wip-38879-mimic
Yuri Weinstein [Wed, 8 May 2019 19:18:18 +0000 (12:18 -0700)]
Merge pull request #27943 from smithfarm/wip-38879-mimic

mimic: core: ENOENT in collection_move_rename on EC backfill target

Reviewed-by: Neha Ojha <nojha@redhat.com>
6 years agoos/bluestore: dump onode meta before "no spanning blob" assertion. 28029/head
Igor Fedotov [Wed, 1 May 2019 22:27:43 +0000 (01:27 +0300)]
os/bluestore: dump onode meta before "no spanning blob" assertion.

Signed-off-by: Igor Fedotov <ifedotov@suse.com>
(cherry picked from commit 4464419dde0f22ad0d5a636bf3992c28ff357ef6)

 Conflicts:
src/os/bluestore/BlueStore.cc
 Trivial

6 years agoos/bluestore: move _dump_xxx methods out of BlueStore class
Igor Fedotov [Wed, 1 May 2019 13:47:20 +0000 (16:47 +0300)]
os/bluestore: move _dump_xxx methods out of BlueStore class

Signed-off-by: Igor Fedotov <ifedotov@suse.com>
(cherry picked from commit 70640aaa12540057df9ac86d4336cdce81f10f07)

 Conflicts:
src/os/bluestore/BlueStore.cc
 Trivial.

6 years agorgw: use chunked encoding to get partial results out faster 28014/head
Robin H. Johnson [Thu, 23 Aug 2018 17:57:24 +0000 (10:57 -0700)]
rgw: use chunked encoding to get partial results out faster

Some operations can take a long time to produce their complete result.

If an RGW operation does not set a Content-Length header, the RGW
frontends (CivetWeb, Beast) buffer the entire response so that a
Content-Length header can be sent.

If the RGW operation takes long enough, the buffering time may exceed
keepalive values, and because no bytes have been sent in the connection,
the connection will be reset.

If an HTTP response header contains neither Content-Length nor chunked
Transfer-Encoding, HTTP keep-alive is not possible.

To fix the issue within these requirements, use chunked
Transfer-Encoding for the following operations:

* RGWCopyObj_ObjStore_S3 **
* RGWDeleteMultiObj_ObjStore_S3 **
* RGWGetUsage_ObjStore_S3
* RGWListBucketMultiparts_ObjStore_S3
* RGWListBucket_ObjStore_S3
* RGWListBuckets_ObjStore_S3
* RGWListMultipart_ObjStore_S3

RGWCopyObj & RGWDeleteMultiObj specifically use send_partial_response
for long-running operations, and are the most impacted by this issue,
esp. for large inputs. RGWCopyObj attempts to send a Progress header
during the copy, but it's not actually passed on to the client until the
end of the copy, because it's buffered by the RGW frontends!

The HTTP/1.1 specification REQUIRES chunked encoding to be supported,
and the specification does NOT require "chunked" to be included in the
"TE" request header.
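
The chunk framing that HTTP/1.1 defines can be sketched as below. This is a
minimal illustration of the wire format only, not the RGW frontend code; the
function names encode_chunk, last_chunk and encode_chunked_body are
hypothetical and do not appear in the patch.

```cpp
#include <cstddef>
#include <sstream>
#include <string>
#include <vector>

// Frame one piece of the body as an HTTP/1.1 chunk:
// size in hex, CRLF, payload, CRLF.
std::string encode_chunk(const std::string& payload) {
  std::ostringstream out;
  out << std::hex << payload.size() << "\r\n" << payload << "\r\n";
  return out.str();
}

// The zero-length chunk that terminates a chunked body (no trailers).
std::string last_chunk() {
  return "0\r\n\r\n";
}

// Hypothetical helper: frame a list of partial results as one chunked body.
std::string encode_chunked_body(const std::vector<std::string>& parts) {
  std::string body;
  for (const auto& p : parts) {
    if (!p.empty()) {  // an empty chunk would be read as the terminator
      body += encode_chunk(p);
    }
  }
  body += last_chunk();
  return body;
}
```

Because each chunk carries its own length, a sender can emit a partial result
as soon as it exists instead of waiting until the total length is known.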

This patch has one side effect: it causes many more small IP packets.
When combined with high-latency links this can increase the apparent
deletion time due to round trips and TCP slow start. Future improvements
to the RGW frontends are possible in two separate but related ways:
- The FE could continue to push more chunks without waiting for the ACK
  on the previous chunk, esp. while under the TCP window size.
- The FE could be patched for different buffer flushing behaviors, as
  that behavior is presently unclear (packets of 200-500 bytes seen).

Performance results:
- Bucket with 5M objects, index sharded 32 ways.
- Index on SSD 3x replicas, Data on spinning disk, 5:2
- Multi-delete of 1000 keys, with a common prefix.
- Cache of index primed by listing the common prefix immediately before
  deletion.
- Timing data captured at the RGW.
- Timing t0 is the TCP ACK sent by the RGW at the end of the response
  body.
- Client is ~75ms away from RGW.
BEFORE:
Time to first byte of response header: 11.3 seconds.
Entire operation: 11.5 seconds.
Response packets: 17
AFTER:
Time to first byte of response header: 3.5ms
Entire operation: 16.36 seconds
Response packets: 206

Backport: mimic, luminous
Issue: http://tracker.ceph.com/issues/12713
Signed-off-by: Robin H. Johnson <rjohnson@digitalocean.com>
(cherry picked from commit d22c1f96707ba9ae84578932bd4d741f6c101a54)

6 years agoMerge pull request #27972 from smithfarm/wip-37348-mimic
Casey Bodley [Tue, 7 May 2019 18:47:59 +0000 (14:47 -0400)]
Merge pull request #27972 from smithfarm/wip-37348-mimic

mimic: rgw: when using nfs-ganesha to upload file, rgw es sync module get failed

Reviewed-by: Abhishek Lekshmanan <abhishek@suse.com>
6 years agoMerge pull request #27948 from smithfarm/wip-37498-mimic
Yuri Weinstein [Tue, 7 May 2019 15:44:51 +0000 (08:44 -0700)]
Merge pull request #27948 from smithfarm/wip-37498-mimic

mimic: rgw: get or set realm zonegroup zone should check user's caps for security

Reviewed-by: Casey Bodley <cbodley@redhat.com>
6 years agoMerge pull request #27356 from pdvian/wip-38922-mimic
Yuri Weinstein [Mon, 6 May 2019 16:28:21 +0000 (09:28 -0700)]
Merge pull request #27356 from pdvian/wip-38922-mimic

mimic: rgw: Fix S3 compatibility bug when CORS is not found

Reviewed-by: Casey Bodley <cbodley@redhat.com>
6 years agoMerge pull request #27367 from pdvian/wip-38924-mimic
Yuri Weinstein [Mon, 6 May 2019 16:27:53 +0000 (09:27 -0700)]
Merge pull request #27367 from pdvian/wip-38924-mimic

mimic: rgw: Adding tcp_nodelay option to Beast

Reviewed-by: Casey Bodley <cbodley@redhat.com>
6 years agoMerge pull request #27421 from pdvian/wip-38959-mimic
Yuri Weinstein [Mon, 6 May 2019 16:27:31 +0000 (09:27 -0700)]
Merge pull request #27421 from pdvian/wip-38959-mimic

mimic: rgw-admin: fix data sync report for master zone

Reviewed-by: Casey Bodley <cbodley@redhat.com>
6 years agoMerge pull request #27603 from pdvian/wip-39017-mimic
Yuri Weinstein [Mon, 6 May 2019 16:27:05 +0000 (09:27 -0700)]
Merge pull request #27603 from pdvian/wip-39017-mimic

mimic: rgw admin: add tenant argument to reshard cancel

Reviewed-by: Casey Bodley <cbodley@redhat.com>
6 years agoMerge pull request #27661 from pdvian/wip-39049-mimic
Yuri Weinstein [Mon, 6 May 2019 16:26:36 +0000 (09:26 -0700)]
Merge pull request #27661 from pdvian/wip-39049-mimic

mimic: rgw: beast: set a default port for endpoints

Reviewed-by: Casey Bodley <cbodley@redhat.com>
6 years agoMerge pull request #27666 from smithfarm/wip-39359-mimic
Yuri Weinstein [Mon, 6 May 2019 16:26:04 +0000 (09:26 -0700)]
Merge pull request #27666 from smithfarm/wip-39359-mimic

mimic: rgw: failed to pass test_bucket_create_naming_bad_punctuation in s3test

Reviewed-by: Casey Bodley <cbodley@redhat.com>
6 years agoMerge pull request #27796 from cbodley/wip-38713
Yuri Weinstein [Mon, 6 May 2019 16:23:01 +0000 (09:23 -0700)]
Merge pull request #27796 from cbodley/wip-38713

mimic: rgw: resolve bugs and clean up garbage collection code

Reviewed-by: J. Eric Ivancich <ivancich@redhat.com>
6 years agoMerge pull request #27828 from ashishkumsingh/wip-39498-mimic
Yuri Weinstein [Mon, 6 May 2019 16:22:32 +0000 (09:22 -0700)]
Merge pull request #27828 from ashishkumsingh/wip-39498-mimic

mimic: rgw: admin: handle delete_at attr in object stat output

Reviewed-by: Abhishek Lekshmanan <abhishek.lekshmanan@gmail.com>
Reviewed-by: Casey Bodley <cbodley@redhat.com>
6 years agoMerge pull request #26140 from ashishkumsingh/wip-37909-mimic
Yuri Weinstein [Mon, 6 May 2019 16:18:24 +0000 (09:18 -0700)]
Merge pull request #26140 from ashishkumsingh/wip-37909-mimic

mimic: rbd_mirror: don't report error if image replay canceled

Reviewed-by: Jason Dillaman <dillaman@redhat.com>
6 years agoMerge pull request #27391 from pdvian/wip-38955-mimic
Yuri Weinstein [Mon, 6 May 2019 16:17:57 +0000 (09:17 -0700)]
Merge pull request #27391 from pdvian/wip-38955-mimic

mimic: tests: krbd discard qa fixes

Reviewed-by: Ilya Dryomov <idryomov@redhat.com>
Reviewed-by: Jason Dillaman <dillaman@redhat.com>
6 years agoMerge pull request #27588 from pdvian/wip-38976-mimic
Yuri Weinstein [Mon, 6 May 2019 16:17:22 +0000 (09:17 -0700)]
Merge pull request #27588 from pdvian/wip-38976-mimic

mimic: rbd: krbd: return -ETIMEDOUT in polling

Reviewed-by: Ilya Dryomov <idryomov@redhat.com>
6 years agoMerge pull request #27958 from dillaman/wip-39585-mimic
Yuri Weinstein [Mon, 6 May 2019 16:16:48 +0000 (09:16 -0700)]
Merge pull request #27958 from dillaman/wip-39585-mimic

mimic: qa/tasks/rbd_fio: fixed missing delimiter between 'cd' and 'configure'

Reviewed-by: Mykola Golub <mgolub@mirantis.com>
6 years agoMerge pull request #27959 from dillaman/wip-39583-mimic
Yuri Weinstein [Mon, 6 May 2019 16:16:20 +0000 (09:16 -0700)]
Merge pull request #27959 from dillaman/wip-39583-mimic

mimic: qa/workunits/rbd: use more recent qemu-iotests that support Bionic

Reviewed-by: Mykola Golub <mgolub@mirantis.com>
6 years agoMerge pull request #26655 from jan--f/c-v-simple-activate-all-mimic
Alfredo Deza [Mon, 6 May 2019 15:15:21 +0000 (11:15 -0400)]
Merge pull request #26655 from jan--f/c-v-simple-activate-all-mimic

mimic: ceph-volume: add --all flag to simple activate

Reviewed-by: Alfredo Deza <adeza@redhat.com>
6 years agorgw: ES sync: wrap all the decode bls in try block 27972/head
Abhishek Lekshmanan [Tue, 9 Oct 2018 11:52:22 +0000 (13:52 +0200)]
rgw: ES sync: wrap all the decode bls in try block

Since decode can throw, wrap all the decodes in a try block.
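
The pattern can be sketched as follows. This is an illustration under stated
assumptions, not the code in rgw_sync_module_es.cc: decode_u32 is a
hypothetical stand-in for a throwing decoder, and decode_or is a hypothetical
wrapper that substitutes a fallback value on failure.

```cpp
#include <cstdint>
#include <stdexcept>
#include <string>

// Hypothetical stand-in for a throwing decoder (not Ceph's decode()):
// reads a 4-byte little-endian integer and throws on short input.
std::uint32_t decode_u32(const std::string& buf) {
  if (buf.size() < 4) {
    throw std::runtime_error("buffer too short");
  }
  std::uint32_t v = 0;
  for (int i = 3; i >= 0; --i) {
    v = (v << 8) | static_cast<unsigned char>(buf[i]);
  }
  return v;
}

// Wrap the decode in a try block so malformed input yields a fallback
// value instead of an exception escaping into the caller.
std::uint32_t decode_or(const std::string& buf, std::uint32_t fallback) {
  try {
    return decode_u32(buf);
  } catch (const std::exception&) {
    return fallback;
  }
}
```

Whether to fall back, log, or return an error on failure is a per-call-site
decision; the sketch only shows that the exception is contained.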

Signed-off-by: Abhishek Lekshmanan <abhishek@suse.com>
(cherry picked from commit 12b12ccea23871688cc4101c72f00b0575f1c01a)

Conflicts:
src/rgw/rgw_sync_module_es.cc
- mimic uses val.begin() where master has val.cbegin()

6 years agoosd/OSDMap: add log for better debugging 27963/head
xie xingguo [Mon, 25 Mar 2019 10:24:16 +0000 (18:24 +0800)]
osd/OSDMap: add log for better debugging

Signed-off-by: xie xingguo <xie.xingguo@zte.com.cn>
(cherry picked from commit a89281ffbb50a4dfc700398e864138b5faaf00f5)

6 years agoosd/OSDMap: calc_pg_upmaps - restrict optimization to origin pools only
xie xingguo [Sat, 23 Mar 2019 01:50:27 +0000 (09:50 +0800)]
osd/OSDMap: calc_pg_upmaps - restrict optimization to origin pools only

The current implementation will try to cancel any pg_upmaps that
would otherwise re-map a PG out from an underfull osd, which is wrong,
e.g., because it could reliably fire the following assert:

src/osd/OSDMap.cc: 4405: FAILED assert(osd_weight.count(i.first))

Also it would not match the expectation if automatic balancing
has been strictly restricted to some specific pools by admin.

Fix by excluding any wild PG that does not belong to the origin pools
passed in when trying to do upmap/unmap.

Fixes: http://tracker.ceph.com/issues/38897
Signed-off-by: xie xingguo <xie.xingguo@zte.com.cn>
(cherry picked from commit 01e8e9482ce7194d347e02ef41acfa6d8d14f614)

6 years agoosd/OSDMap: drop local pool filter in calc_pg_upmaps
xie xingguo [Sat, 23 Feb 2019 00:33:40 +0000 (08:33 +0800)]
osd/OSDMap: drop local pool filter in calc_pg_upmaps

The local pre-loaded pool filter is completely redundant since
the below check:

if (!only_pools.empty() && !only_pools.count(i.first))

could reliably catch both cases - either optimization should be
restricted to the specific pools fed in, or to all existing pools.
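
A small sketch of why that single check covers both cases. The function name
pools_to_optimize and the map's placeholder value type are assumptions made
for illustration; only the condition itself is taken from the message above.

```cpp
#include <cstdint>
#include <map>
#include <set>
#include <vector>

// Return the ids of pools that pass the filter. An empty only_pools
// means "no restriction", so every pool passes.
std::vector<std::int64_t> pools_to_optimize(
    const std::map<std::int64_t, int>& pools,  // pool id -> placeholder value
    const std::set<std::int64_t>& only_pools) {
  std::vector<std::int64_t> out;
  out.reserve(pools.size());
  for (const auto& i : pools) {
    if (!only_pools.empty() && !only_pools.count(i.first)) {
      continue;  // restricted run, and this pool was not requested
    }
    out.push_back(i.first);
  }
  return out;
}
```

With an empty only_pools the first operand is false and nothing is skipped;
with a non-empty set only the listed pools survive, so no separate pre-loaded
filter is needed.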

Let's clean it up.

Signed-off-by: xie xingguo <xie.xingguo@zte.com.cn>
(cherry picked from commit 3e6bad9821b5fb3e780d970666fbdfbf217d905e)

6 years agocrush: fix upmap overkill
xie xingguo [Sat, 19 Jan 2019 09:19:10 +0000 (17:19 +0800)]
crush: fix upmap overkill

It appears that OSDMap::maybe_remove_pg_upmaps's sanity checks
are overzealous. With some customized crush rules it is possible
for osdmaptool to generate valid upmaps, but maybe_remove_pg_upmaps
will cancel them.

Fixes: http://tracker.ceph.com/issues/37968
Signed-off-by: xie xingguo <xie.xingguo@zte.com.cn>
(cherry picked from commit 5c4d241c7f796cb685e9944bf237028162122725)

Conflicts:
- maybe_remove_pg_upmaps input changing
        - slight c++11 auto conflicts

6 years agoosd/OSDMap: using std::vector::reserve to reduce memory reallocation
xie xingguo [Mon, 18 Feb 2019 07:40:22 +0000 (15:40 +0800)]
osd/OSDMap: using std::vector::reserve to reduce memory reallocation

In C++ vectors are dynamic arrays.
Vectors are assigned memory in blocks of contiguous locations.
When the memory allocated for the vector falls short of storing
new elements, a new memory block is allocated to vector and all
elements are copied from the old location to the new location.
This reallocation of elements helps vectors to grow when required.
However, it is a costly operation: the time complexity of this step
is linear in the number of elements.
Try to use std::vector::reserve whenever possible if performance
matters.
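
The effect can be observed with a short sketch. The helper
count_reallocations is hypothetical and is not part of OSDMap; it counts
capacity changes as a proxy for reallocations.

```cpp
#include <cstddef>
#include <vector>

// Fill a vector with n ints and count how many times its capacity
// changed, i.e. how many reallocations push_back triggered.
std::size_t count_reallocations(std::size_t n, bool reserve_first) {
  std::vector<int> v;
  if (reserve_first) {
    v.reserve(n);  // one allocation up front
  }
  std::size_t reallocations = 0;
  std::size_t last_capacity = v.capacity();
  for (std::size_t i = 0; i < n; ++i) {
    v.push_back(static_cast<int>(i));
    if (v.capacity() != last_capacity) {
      ++reallocations;
      last_capacity = v.capacity();
    }
  }
  return reallocations;
}
```

After reserve(n), the standard guarantees that no reallocation happens until
the size exceeds n, so the reserved run reports zero; the exact count for the
unreserved run depends on the library's growth factor.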

Signed-off-by: xie xingguo <xie.xingguo@zte.com.cn>
(cherry picked from commit 4a0eabb3a65107cbee5e692ade564102e2b2f8aa)