git.apps.os.sepia.ceph.com Git - ceph.git/log
ceph.git
7 years agoMerge pull request #22176 from pdvian/wip-24107-luminous
Yan, Zheng [Sat, 2 Jun 2018 00:35:33 +0000 (08:35 +0800)]
Merge pull request #22176 from pdvian/wip-24107-luminous

luminous: mds: set could_consume to false when no purge queue item actually exe…

7 years agoMerge pull request #22310 from ukernel/luminous-24341
Yan, Zheng [Sat, 2 Jun 2018 00:34:47 +0000 (08:34 +0800)]
Merge pull request #22310 from ukernel/luminous-24341

luminous: mds: fix some memory leak

7 years agoMerge pull request #22271 from pdvian/wip-24205-luminous
Yan, Zheng [Sat, 2 Jun 2018 00:34:19 +0000 (08:34 +0800)]
Merge pull request #22271 from pdvian/wip-24205-luminous

luminous: mds: broadcast quota to relevant clients when quota is explicitly set

7 years agoMerge pull request #22221 from pdvian/wip-24201-luminous
Yan, Zheng [Sat, 2 Jun 2018 00:33:28 +0000 (08:33 +0800)]
Merge pull request #22221 from pdvian/wip-24201-luminous

luminous: client: fix issue of revoking non-auth caps

7 years agoMerge pull request #22208 from pdvian/wip-24188-luminous
Yan, Zheng [Sat, 2 Jun 2018 00:32:54 +0000 (08:32 +0800)]
Merge pull request #22208 from pdvian/wip-24188-luminous

luminous: kceph: umount on evicted client blocks forever

7 years agoMerge pull request #22171 from ukernel/luminous-24108
Yan, Zheng [Sat, 2 Jun 2018 00:32:03 +0000 (08:32 +0800)]
Merge pull request #22171 from ukernel/luminous-24108

luminous: mds: avoid calling rejoin_gather_finish() two times successively

7 years agoMerge pull request #22168 from ukernel/luminous-24207
Yan, Zheng [Sat, 2 Jun 2018 00:31:22 +0000 (08:31 +0800)]
Merge pull request #22168 from ukernel/luminous-24207

luminous: client: avoid freeing inode when it contains TX buffer head

7 years agoMerge pull request #22118 from pdvian/wip-24050-luminous
Yan, Zheng [Sat, 2 Jun 2018 00:30:30 +0000 (08:30 +0800)]
Merge pull request #22118 from pdvian/wip-24050-luminous

luminous: mds: include nfiles/nsubdirs of directory inode in MClientCaps

7 years agoMerge pull request #22018 from batrick/i23991
Yan, Zheng [Sat, 2 Jun 2018 00:30:11 +0000 (08:30 +0800)]
Merge pull request #22018 from batrick/i23991

luminous: client: hangs on umount if it had an MDS session evicted

7 years agoMerge pull request #21990 from batrick/i23935
Yan, Zheng [Sat, 2 Jun 2018 00:28:56 +0000 (08:28 +0800)]
Merge pull request #21990 from batrick/i23935

luminous: mds: don't discover inode/dirfrag when mds is in 'starting' state

7 years agoMerge pull request #21989 from batrick/i24130
Yan, Zheng [Sat, 2 Jun 2018 00:28:02 +0000 (08:28 +0800)]
Merge pull request #21989 from batrick/i24130

luminous: mds: handle imported session race

7 years agoMerge pull request #21922 from pdvian/wip-23984-luminous
Yan, Zheng [Sat, 2 Jun 2018 00:27:43 +0000 (08:27 +0800)]
Merge pull request #21922 from pdvian/wip-23984-luminous

luminous: mds: mark new root inode dirty

7 years agoMerge pull request #21921 from pdvian/wip-23982-luminous
Yan, Zheng [Sat, 2 Jun 2018 00:27:16 +0000 (08:27 +0800)]
Merge pull request #21921 from pdvian/wip-23982-luminous

luminous: qa: fix blacklisted check for test_lifecycle

7 years agoMerge pull request #21901 from pdvian/wip-23951-luminous
Yan, Zheng [Sat, 2 Jun 2018 00:26:49 +0000 (08:26 +0800)]
Merge pull request #21901 from pdvian/wip-23951-luminous

luminous: mds: kick rdlock if waiting for dirfragtreelock

7 years agoMerge pull request #21900 from pdvian/wip-23946-luminous
Yan, Zheng [Sat, 2 Jun 2018 00:26:28 +0000 (08:26 +0800)]
Merge pull request #21900 from pdvian/wip-23946-luminous

luminous: mds: crash when failover

7 years agoMerge pull request #21874 from pdvian/wip-23936-luminous
Yan, Zheng [Sat, 2 Jun 2018 00:25:33 +0000 (08:25 +0800)]
Merge pull request #21874 from pdvian/wip-23936-luminous

luminous: cephfs-journal-tool: wait prezero ops before destroying journal

7 years agoMerge pull request #21841 from pdvian/wip-23931-luminous
Yan, Zheng [Sat, 2 Jun 2018 00:24:58 +0000 (08:24 +0800)]
Merge pull request #21841 from pdvian/wip-23931-luminous

luminous: qa: remove racy/buggy test_purge_queue_op_rate

7 years agoMerge pull request #21730 from joscollin/wip-23933-luminous
Yan, Zheng [Sat, 2 Jun 2018 00:24:31 +0000 (08:24 +0800)]
Merge pull request #21730 from joscollin/wip-23933-luminous

luminous: client: avoid second lock on client_lock

7 years agoMerge pull request #21617 from pdvian/wip-23835-luminous
Yan, Zheng [Sat, 2 Jun 2018 00:23:21 +0000 (08:23 +0800)]
Merge pull request #21617 from pdvian/wip-23835-luminous

luminous: mds: fix occasional dir rstat inconsistency between multi-MDSes

7 years agoMerge pull request #21600 from joscollin/wip-23475-luminous
Yan, Zheng [Sat, 2 Jun 2018 00:22:30 +0000 (08:22 +0800)]
Merge pull request #21600 from joscollin/wip-23475-luminous

luminous: ceph-fuse: trim ceph-fuse -V output

7 years agoMerge pull request #21616 from joscollin/wip-23308-luminous
Yan, Zheng [Sat, 2 Jun 2018 00:21:41 +0000 (08:21 +0800)]
Merge pull request #21616 from joscollin/wip-23308-luminous

luminous: doc: Fix -d description in ceph-fuse

7 years agoMerge pull request #21899 from pdvian/wip-23950-luminous
Yan, Zheng [Sat, 2 Jun 2018 00:19:58 +0000 (08:19 +0800)]
Merge pull request #21899 from pdvian/wip-23950-luminous

luminous: mds: trim log during shutdown to clean metadata

7 years agoMerge pull request #21589 from pdvian/wip-23818-luminous
Yan, Zheng [Sat, 2 Jun 2018 00:15:10 +0000 (08:15 +0800)]
Merge pull request #21589 from pdvian/wip-23818-luminous

luminous: client: add client option descriptions

7 years agoMerge pull request #21687 from batrick/i23638
Yan, Zheng [Sat, 2 Jun 2018 00:13:38 +0000 (08:13 +0800)]
Merge pull request #21687 from batrick/i23638

luminous: ceph-fuse: getgroups failure causes exception

7 years agomds: trim log during shutdown to clean metadata 21899/head
Patrick Donnelly [Sun, 29 Apr 2018 00:17:53 +0000 (17:17 -0700)]
mds: trim log during shutdown to clean metadata

Otherwise the trimming won't advance so that the remaining inodes are marked
clean.

Fixes: http://tracker.ceph.com/issues/23923
Signed-off-by: Patrick Donnelly <pdonnell@redhat.com>
(cherry picked from commit c60ef1b806c4a0c60362193675990447d82a65f4)

7 years agoMerge pull request #22191 from alfredodeza/backport-wip-cv-ansible-deps
Alfredo Deza [Wed, 30 May 2018 16:10:33 +0000 (12:10 -0400)]
Merge pull request #22191 from alfredodeza/backport-wip-cv-ansible-deps

luminous ceph-volume tests.functional install new ceph-ansible dependencies

Reviewed-by: Andrew Schoen <aschoen@redhat.com>
7 years agomds: fix leak of MDSCacheObject::waiting 22310/head
Yan, Zheng [Wed, 30 May 2018 03:23:25 +0000 (11:23 +0800)]
mds: fix leak of MDSCacheObject::waiting

Fixes: http://tracker.ceph.com/issues/24289
Signed-off-by: "Yan, Zheng" <zyan@redhat.com>
(cherry picked from commit 8f3c8bf6eafd3545c3c786b8520e8ff2c40af2a0)

7 years agomds: fix some memory leak
Yan, Zheng [Fri, 25 May 2018 08:11:30 +0000 (16:11 +0800)]
mds: fix some memory leak

Fixes: http://tracker.ceph.com/issues/24289
Signed-off-by: "Yan, Zheng" <zyan@redhat.com>
(cherry picked from commit e7c149b93dc384ee4a2c8250c502548d12535123)

7 years agomds: broadcast quota to relevant clients when quota is explicitly set 22271/head
Zhi Zhang [Wed, 16 May 2018 03:21:48 +0000 (11:21 +0800)]
mds: broadcast quota to relevant clients when quota is explicitly set

Proactively broadcast the quota to relevant clients when a quota is
explicitly set; otherwise a client may not receive the quota update
for a long time.

Fixes: http://tracker.ceph.com/issues/24133
Signed-off-by: Zhi Zhang <zhangz.david@outlook.com>
(cherry picked from commit b2a7643b102dbbb8221dcb8a785db5e4276ac284)

7 years agodoc: Fix typo in ceph-fuse 21616/head
Jos Collin [Thu, 24 May 2018 11:57:02 +0000 (17:27 +0530)]
doc: Fix typo in ceph-fuse

Fixes: https://github.com/ceph/ceph/pull/21616#pullrequestreview-122923127
Signed-off-by: Jos Collin <jcollin@redhat.com>
(cherry picked from commit 7fd3189c98b0b1c2885110c2c33487ef36a9596a)

7 years agoMerge pull request #21603 from joscollin/wip-23151-luminous
Kefu Chai [Thu, 24 May 2018 09:57:12 +0000 (17:57 +0800)]
Merge pull request #21603 from joscollin/wip-23151-luminous

luminous: doc: Update ceph-fuse doc

Reviewed-by: Sage Weil <sage@redhat.com>
7 years agoclient: fix issue of revoking non-auth caps 22221/head
Yan, Zheng [Fri, 18 May 2018 06:26:32 +0000 (14:26 +0800)]
client: fix issue of revoking non-auth caps

When a non-auth mds revokes caps, Fcb caps can still be issued by the
auth mds. It is wrong to flush the buffer or invalidate the cache when a
non-auth mds revokes other caps. This bug can cause the client to not
respond to the revoke.

Signed-off-by: "Yan, Zheng" <zyan@redhat.com>
Fixes: https://tracker.ceph.com/issues/24172
(cherry picked from commit 341a9114e0726e1a7cbb7e6f22adb54c2024c506)

7 years agoMerge pull request #22076 from tchaikov/wip-cmake-build-rocksdb-no-Werror
Kefu Chai [Thu, 24 May 2018 08:56:19 +0000 (16:56 +0800)]
Merge pull request #22076 from tchaikov/wip-cmake-build-rocksdb-no-Werror

luminous: cmake: disable FAIL_ON_WARNINGS for rocksdb

Reviewed-by: Nathan Cutler <cutler@suse.cz>
Reviewed-by: Patrick Donnelly <pdonnell@redhat.com>
7 years agoMerge pull request #22197 from dzafman/wip-test-fixes-luminous
Kefu Chai [Thu, 24 May 2018 07:06:31 +0000 (15:06 +0800)]
Merge pull request #22197 from dzafman/wip-test-fixes-luminous

luminous: test fixes

Reviewed-by: Kefu Chai <kchai@redhat.com>
7 years agoqa/tasks/cephfs: add timeout parameter to kclient umount_wait 22208/head
Yan, Zheng [Fri, 11 May 2018 12:26:43 +0000 (20:26 +0800)]
qa/tasks/cephfs: add timeout parameter to kclient umount_wait

Just make the caller happy. There is no easy way to support a timeout.

Signed-off-by: Yan, Zheng <zyan@redhat.com>
Fixes: https://tracker.ceph.com/issues/24053
(cherry picked from commit e7d0b41deae7ec99ddf0a1f5f30ea82683b7b474)

7 years agomds: reply session reject for open request from blacklisted client
Yan, Zheng [Fri, 11 May 2018 06:55:12 +0000 (14:55 +0800)]
mds: reply session reject for open request from blacklisted client

The kernel client and old versions of libcephfs do not check whether
they are blacklisted. They can get stuck opening a session after being
blacklisted. The session reject message avoids this.

Signed-off-by: "Yan, Zheng" <zyan@redhat.com>
Fixes: https://tracker.ceph.com/issues/24054
(cherry picked from commit b7c6cd8a54f094acb58603b8c6bae9e570a73e27)

7 years agotest: wait_for_pg_stats() should do another check after last 13 second sleep 22197/head
David Zafman [Wed, 23 May 2018 19:36:44 +0000 (12:36 -0700)]
test: wait_for_pg_stats() should do another check after last 13 second sleep

Signed-off-by: David Zafman <dzafman@redhat.com>
(cherry picked from commit 151de1797b9163918b95a5996f422688e0964126)

7 years agoos/bluestore: fix data read error injection in bluestore
Sage Weil [Mon, 8 Jan 2018 22:27:51 +0000 (16:27 -0600)]
os/bluestore: fix data read error injection in bluestore

Signed-off-by: Sage Weil <sage@redhat.com>
(cherry picked from commit be32d15a04d9d900f604aa366e82791249f1bdb2)

7 years agoceph-volume tests.functional install new ceph-ansible dependencies 22191/head
Alfredo Deza [Mon, 21 May 2018 11:11:28 +0000 (07:11 -0400)]
ceph-volume tests.functional install new ceph-ansible dependencies

Make note that ceph-ansible's requirements.txt can't be used just yet

Signed-off-by: Alfredo Deza <adeza@redhat.com>
(cherry picked from commit 22310f43165e474e8e12732be57217b26e2b5424)

7 years agomds: set could_consume to false when no purge queue item actually executed 22176/head
Xuehan Xu [Thu, 10 May 2018 04:22:24 +0000 (12:22 +0800)]
mds: set could_consume to false when no purge queue item actually executed

Fixes: http://tracker.ceph.com/issues/24073
Signed-off-by: Xuehan Xu <xuxuehan@360.cn>
(cherry picked from commit 46b4e6afa631058fe066bfd58c76d644d5c2181d)

7 years agoMerge pull request #21502 from smithfarm/wip-23782-luminous
Kefu Chai [Wed, 23 May 2018 09:41:51 +0000 (17:41 +0800)]
Merge pull request #21502 from smithfarm/wip-23782-luminous

luminous: table of contents doesn't render for luminous/jewel docs

Reviewed-by: Alfredo Deza <adeza@redhat.com>
Reviewed-by: Casey Bodley <cbodley@redhat.com>
7 years agomds: tighten conditions of calling rejoin_gather_finish() 22171/head
Yan, Zheng [Tue, 8 May 2018 03:32:01 +0000 (11:32 +0800)]
mds: tighten conditions of calling rejoin_gather_finish()

Handle two cases:
1. mds receives all cache rejoin messages, then receives mdsmap that
   says mds cluster enters rejoining state.
2. when opening undef inodes/dirfrags, other mds restarts.

Signed-off-by: "Yan, Zheng" <zyan@redhat.com>
(cherry picked from commit 0a38a499b86c0ee13aa0e783a8359bcce0876088)

7 years agomds: avoid calling rejoin_gather_finish() two times successively
Yan, Zheng [Tue, 8 May 2018 02:42:05 +0000 (10:42 +0800)]
mds: avoid calling rejoin_gather_finish() two times successively

If MDCache::rejoin_gather is empty and MDCache::rejoins_pending is true
when MDCache::process_imported_caps() calls maybe_send_pending_rejoins(),
both MDCache::rejoin_send_rejoins() and MDCache::process_imported_caps()
may call rejoin_gather_finish().

Signed-off-by: "Yan, Zheng" <zyan@redhat.com>
Fixes: http://tracker.ceph.com/issues/24047
(cherry picked from commit 0451dae777a2a9b1e70303d7bbc4398849f45f3e)

7 years agomds: properly reconnect client caps after loading inodes 21900/head
Yan, Zheng [Wed, 2 May 2018 02:23:33 +0000 (10:23 +0800)]
mds: properly reconnect client caps after loading inodes

Commit e43c02d6 "mds: filter out blacklisted clients when importing
caps" makes MDCache::process_imported_caps() ignore clients that are
not in MDCache::rejoin_imported_session_map. The map does not contain
clients from which mds has received reconnect messages. This causes
some client caps (corresponding inodes were not in cache when mds was
in reconnect state) to get dropped.

Signed-off-by: "Yan, Zheng" <zyan@redhat.com>
(cherry picked from commit 48f60e7f274de9d76499816a528eff859bb161e3)

Conflicts:
src/mds/MDCache.h: Resolved for rejoin_recovered_client and
rejoin_open_sessions_finish

7 years agomds: filter out blacklisted clients when importing caps
Yan, Zheng [Sun, 22 Apr 2018 09:46:28 +0000 (17:46 +0800)]
mds: filter out blacklisted clients when importing caps

The very first step of importing caps is calling
Server::prepare_force_open_sessions(). This patch makes the function
ignore blacklisted clients and return a session map for clients that
are not blacklisted. This patch also modifies the code that actually
does cap imports so that it skips caps for clients that are not in the
session map.

Signed-off-by: "Yan, Zheng" <zyan@redhat.com>
Fixes: http://tracker.ceph.com/issues/23518
(cherry picked from commit e43c02d6abd065dff413440a0f9c3d3f6653e87b)

Conflicts:
src/mds/MDCache.h: Resolved in rejoin_open_sessions_finish
src/mds/Migrator.cc : Resolved in handle_export_dir
and decode_import_inode_caps
src/mds/Server.cc : Resolved in _rename_prepare_import
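
The filtering described in this commit message can be sketched roughly as follows. This is a minimal Python illustration, not the MDS code: `prepare_force_open_sessions` borrows its name from the message, but its signature, the `blacklist` set, the session strings, and `import_caps` are all hypothetical.

```python
def prepare_force_open_sessions(client_ids, blacklist):
    """Return a session map that contains only non-blacklisted clients."""
    return {cid: f"session-{cid}" for cid in client_ids if cid not in blacklist}


def import_caps(caps_by_client, session_map):
    """Import caps only for clients that have an entry in the session map."""
    imported = {}
    for cid, caps in caps_by_client.items():
        if cid not in session_map:
            continue  # client was filtered out (blacklisted): skip its caps
        imported[cid] = caps
    return imported


# Hypothetical client ids and cap names, for illustration only.
sessions = prepare_force_open_sessions([1, 2, 3], blacklist={2})
result = import_caps({1: ["Fr"], 2: ["Fw"], 3: ["Fc"]}, sessions)
```

Here client 2 never enters the session map, so its caps are dropped at import time rather than being attached to a session that should not exist.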

7 years agomds: don't add blacklisted clients to reconnect gather set
Yan, Zheng [Fri, 20 Apr 2018 10:04:09 +0000 (18:04 +0800)]
mds: don't add blacklisted clients to reconnect gather set

Signed-off-by: "Yan, Zheng" <zyan@redhat.com>
(cherry picked from commit 857e3edac5eec82f1d04caec440d3d6e61bbf679)

Conflicts:
src/mds/SessionMap.h: Removed get_client_set

7 years agomds: combine MDCache::{cap_exports,cap_export_targets}
Yan, Zheng [Sun, 22 Apr 2018 10:27:52 +0000 (18:27 +0800)]
mds: combine MDCache::{cap_exports,cap_export_targets}

This change saves a map lookup.

Signed-off-by: "Yan, Zheng" <zyan@redhat.com>
(cherry picked from commit ea72863b2b8ab5c387b70687300f6f7cff2019db)

7 years agoclient: avoid freeing inode when it contains TX buffer heads 22168/head
YunfeiGuan [Tue, 8 May 2018 11:35:32 +0000 (19:35 +0800)]
client: avoid freeing inode when it contains TX buffer heads

ObjectCacher::discard_set() prematurely deletes TX buffer heads, but
the pending writebacks still pin the parent objects of these buffer heads.
The assertion "oset.objects.empty()" is triggered if an inode with pending
writebacks gets freed.

Fixes: http://tracker.ceph.com/issues/23837
Signed-off-by: Guan yunfei <yunfei.guan@xtaotech.com>
Signed-off-by: "Yan, Zheng" <zyan@redhat.com>
(cherry picked from commit 8a03757ca0ab493c6c2ea4fa4307e053e8ebc944)

7 years agoosdc/ObjectCacher: allow discard to complete in-flight writeback
Jason Dillaman [Wed, 4 Apr 2018 15:47:05 +0000 (11:47 -0400)]
osdc/ObjectCacher: allow discard to complete in-flight writeback

Signed-off-by: Jason Dillaman <dillaman@redhat.com>
(cherry picked from commit feee52d57d6386a2d62580d1b7ca24ee56831b20)

7 years agotest: Whitelist corrections
David Zafman [Tue, 22 May 2018 15:37:22 +0000 (08:37 -0700)]
test: Whitelist corrections

Signed-off-by: David Zafman <dzafman@redhat.com>
(cherry picked from commit ee4acb6e1ff7458ceaefdb288cbcb158c6a3bed3)

Add "erasure code profile property .ruleset-failure-domain. is no longer supported" for luminous

7 years agoMerge pull request #22134 from dzafman/wip-missed-backport
Josh Durgin [Tue, 22 May 2018 15:29:17 +0000 (08:29 -0700)]
Merge pull request #22134 from dzafman/wip-missed-backport

test: Add CACHE_POOL_NO_HIT_SET to whitelist for mon/pool_ops.sh

Reviewed-by: Josh Durgin <jdurgin@redhat.com>
7 years agotest: Add CACHE_POOL_NO_HIT_SET to whitelist for mon/pool_ops.sh 22134/head
David Zafman [Sat, 19 May 2018 03:15:41 +0000 (20:15 -0700)]
test: Add CACHE_POOL_NO_HIT_SET to whitelist for mon/pool_ops.sh

Ignore
  cluster [WRN] Health check failed: 1 cache pools are missing hit_sets (CACHE_POOL_NO_HIT_SET)

Signed-off-by: David Zafman <dzafman@redhat.com>
(cherry picked from commit 4fad800043d44024a496f78869e9bb02a16af063)

7 years agoMerge pull request #22044 from dzafman/wip-24045-luminous
Josh Durgin [Mon, 21 May 2018 23:53:31 +0000 (16:53 -0700)]
Merge pull request #22044 from dzafman/wip-24045-luminous

luminous: osd: Don't evict even when preemption has restarted with smaller chunk

Reviewed-by: Josh Durgin <jdurgin@redhat.com>
7 years agoMerge pull request #22131 from ceph/wip-yuriw-clients-fix-luminous
Josh Durgin [Mon, 21 May 2018 22:54:21 +0000 (15:54 -0700)]
Merge pull request #22131 from ceph/wip-yuriw-clients-fix-luminous

qa/tests: added supported distro

Reviewed-by: Josh Durgin <jdurgin@redhat.com>
7 years agoMerge pull request #22128 from liewegas/wip-rbd-msgr-luminous
Jason Dillaman [Mon, 21 May 2018 20:02:18 +0000 (16:02 -0400)]
Merge pull request #22128 from liewegas/wip-rbd-msgr-luminous

luminous: qa/suites/rbd/basic/msgr-failures: remove many.yaml

Reviewed-by: Jason Dillaman <dillaman@redhat.com>
7 years agoqa/suites/rbd/basic/msgr-failures: remove many.yaml 22128/head
Sage Weil [Mon, 21 May 2018 19:38:34 +0000 (14:38 -0500)]
qa/suites/rbd/basic/msgr-failures: remove many.yaml

Overkill, and triggers some failures, see
http://tracker.ceph.com/issues/23789

Removed in master by 4046f46d0e6a70d860d74945dfb95c2511394640

Fixes: http://tracker.ceph.com/issues/23789
Signed-off-by: Sage Weil <sage@redhat.com>
7 years agoMerge pull request #21547 from VictorDenisov/backport
Yuri Weinstein [Mon, 21 May 2018 16:21:16 +0000 (09:21 -0700)]
Merge pull request #21547 from VictorDenisov/backport

luminous: tests: filestore journal replay does not guard omap operations

Reviewed-by: David Zafman <dzafman@redhat.com>
7 years agoMerge pull request #21515 from tchaikov/wip-luminous-pr-21469
Yuri Weinstein [Mon, 21 May 2018 16:20:30 +0000 (09:20 -0700)]
Merge pull request #21515 from tchaikov/wip-luminous-pr-21469

luminous: mon/LogMonitor: do not crash on log sub w/ no messages

Reviewed-by: David Zafman <dzafman@redhat.com>
7 years agoMerge pull request #21376 from pdvian/wip-23666-luminous
Yuri Weinstein [Mon, 21 May 2018 16:18:52 +0000 (09:18 -0700)]
Merge pull request #21376 from pdvian/wip-23666-luminous

luminous: msg/async/AsyncConnection: Fix FPE in process_connection

Reviewed-by: Kefu Chai <kchai@redhat.com>
7 years agoMerge pull request #21405 from pdvian/wip-23672-luminous
Yuri Weinstein [Mon, 21 May 2018 16:18:05 +0000 (09:18 -0700)]
Merge pull request #21405 from pdvian/wip-23672-luminous

luminous: os/bluestore: alter the allow_eio policy regarding kernel's error list.

Reviewed-by: Kefu Chai <kchai@redhat.com>
Reviewed-by: Sage Weil <sage@redhat.com>
7 years agoMerge pull request #21407 from tchaikov/wip-luminous-23246
Yuri Weinstein [Mon, 21 May 2018 16:17:04 +0000 (09:17 -0700)]
Merge pull request #21407 from tchaikov/wip-luminous-23246

luminous: os/bluestore: fix exceeding the max IO queue depth in KernelDevice.

Reviewed-by: Radoslaw Zarzynski <rzarzyns@redhat.com>
7 years agoMerge pull request #21514 from smithfarm/wip-posix-zfs-luminous
Yuri Weinstein [Mon, 21 May 2018 16:15:04 +0000 (09:15 -0700)]
Merge pull request #21514 from smithfarm/wip-posix-zfs-luminous

luminous: common: posix_fallocate on ZFS returns EINVAL

Reviewed-by: Mykola Golub <mgolub@mirantis.com>
Reviewed-by: Willem Jan Withagen <wjw@digiware.nl>
7 years agoMerge pull request #21818 from xiexingguo/wip-23925
Yuri Weinstein [Mon, 21 May 2018 16:12:56 +0000 (09:12 -0700)]
Merge pull request #21818 from xiexingguo/wip-23925

luminous: osd/OSDMap: check against cluster topology changing before applying pg upmaps

Reviewed-by: Sage Weil <sage@redhat.com>
7 years agomds: include nfiles/nsubdirs of directory inode in MClientCaps 22118/head
Yan, Zheng [Thu, 26 Apr 2018 07:12:48 +0000 (15:12 +0800)]
mds: include nfiles/nsubdirs of directory inode in MClientCaps

A directory inode's dirstat gets updated by request replies, but not by
cap messages. This causes a problem in the following case:

1. MDS modifies a directory
2. MDS issues CEPH_CAP_ANY_SHARED to client
3. The client satisfies stat(2) from its cached metadata.

Signed-off-by: "Yan, Zheng" <zyan@redhat.com>
Fixes: http://tracker.ceph.com/issues/23855
(cherry picked from commit ee2c628f6783954e9b25fab8ac9b572a58666a91)

Conflicts:
src/messages/MClientCaps.h: Resolved in encode_payload

7 years agoqa/tests: added supported distro 22094/head 22131/head
Yuri Weinstein [Fri, 18 May 2018 19:53:25 +0000 (12:53 -0700)]
qa/tests: added supported distro

Signed-off-by: Yuri Weinstein <yweinste@redhat.com>
7 years agoMerge pull request #21575 from ceph/wip-cd-fix-pool-create
vasukulkarni [Fri, 18 May 2018 17:27:56 +0000 (10:27 -0700)]
Merge pull request #21575 from ceph/wip-cd-fix-pool-create

luminous: tests: ceph-deploy: create the rbd pool right after install

7 years agotest: Fix omap_digest changes in osd-scrub-repair.sh 22044/head
David Zafman [Fri, 18 May 2018 06:50:43 +0000 (23:50 -0700)]
test: Fix omap_digest changes in osd-scrub-repair.sh

Signed-off-by: David Zafman <dzafman@redhat.com>
7 years agotest: No more omap_digest being set
David Zafman [Fri, 18 May 2018 04:55:23 +0000 (21:55 -0700)]
test: No more omap_digest being set

Signed-off-by: David Zafman <dzafman@redhat.com>
7 years agotest: Luminous specific changes
David Zafman [Fri, 18 May 2018 00:35:54 +0000 (17:35 -0700)]
test: Luminous specific changes

*** Not sure why this wasn't seen earlier

Signed-off-by: David Zafman <dzafman@redhat.com>
7 years agotest: Need to escape parens in log-whitelist for grep
David Zafman [Fri, 18 May 2018 00:30:32 +0000 (17:30 -0700)]
test: Need to escape parens in log-whitelist for grep

Signed-off-by: David Zafman <dzafman@redhat.com>
(cherry picked from commit a9e43ed85236c8412679da58d068253e80d21d05)

Conflicts:
qa/suites/rados/monthrash/ceph.yaml (no changes needed)

Additional changes for luminous:
qa/suites/rados/basic/tasks/rados_api_tests.yaml
qa/suites/rados/singleton/all/thrash-eio.yaml
qa/suites/smoke/basic/tasks/rados_api_tests.yaml
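
Why the parentheses need escaping can be illustrated with a small sketch. It uses Python's `re` module as a stand-in for an extended-regex grep (an assumption; the actual whitelist matching is done by the test framework), and the two pattern strings are hypothetical whitelist entries built from a health warning of the kind these tests whitelist.

```python
import re

# A cluster log line of the kind the whitelist is meant to ignore.
line = ("cluster [WRN] Health check failed: 1 cache pools are missing "
        "hit_sets (CACHE_POOL_NO_HIT_SET)")

# Unescaped parentheses form a regex group, so the pattern expects the text
# "hit_sets CACHE_POOL_NO_HIT_SET" with no literal parentheses in between.
unescaped = r"missing hit_sets (CACHE_POOL_NO_HIT_SET)"

# Escaped parentheses match the literal "(" and ")" in the log line.
escaped = r"missing hit_sets \(CACHE_POOL_NO_HIT_SET\)"

unescaped_matches = re.search(unescaped, line) is not None
escaped_matches = re.search(escaped, line) is not None
```

Only the escaped pattern matches the line, which is why a whitelist entry containing surrounding text plus unescaped parentheses can silently fail to suppress the warning.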

7 years agoosd: Clear part of cleaned_meta_map in case of a restarted smaller chunk
David Zafman [Wed, 16 May 2018 00:32:50 +0000 (17:32 -0700)]
osd: Clear part of cleaned_meta_map in case of a restarted smaller chunk

This cannot happen at the primary because scrub_compare_maps() is only
called once per chunk start.

Preemption causes a smaller chunk from start to be processed again at
replicas.  We clear any of the previous chunk's information.

Signed-off-by: David Zafman <dzafman@redhat.com>
(cherry picked from commit 9e0ac797c602a088447679b04e14ec0cfaf9dd7b)

7 years agoosd: Don't evict even when preemption has restarted with smaller chunk
David Zafman [Thu, 10 May 2018 00:32:39 +0000 (17:32 -0700)]
osd: Don't evict even when preemption has restarted with smaller chunk

Fixes: https://tracker.ceph.com/issues/24045
Signed-off-by: David Zafman <dzafman@redhat.com>
(cherry picked from commit 818b59fa95ee60e86991276f18c4dee405dc79b1)

Conflicts:
src/osd/PG.h (trivial)

7 years agoosd/PrimaryLogPG: defer evict if head *or* object intersect scrub interval
Sage Weil [Tue, 24 Apr 2018 20:35:28 +0000 (15:35 -0500)]
osd/PrimaryLogPG: defer evict if head *or* object intersect scrub interval

Consider a scenario like:
- scrub [3:2525d100:::earlier:head,3:2525d12f:::foo:200]
 - we see 3:2525d12f:::foo:100 and include it in scrub map
- scrub [3:2525d12f:::foo:200, 3:2525dfff:::later:head]
- some op(s) that cause scrub to be preempted
- agent_work wants to evict 3:2525d12f:::foo:100
  - write_blocked_by_scrub sees scrub is preempted, returns false
  - 3:2525d12f:::foo:100 is removed, :head SnapSet is updated
- scrub rescrubs [3:2525d12f:::foo:200, 3:2525dfff:::later:head]
  - includes (updated) :head SnapSet
  - issues error like "3:2525d12f:::foo:100 is an unexpected clone"

Fix the problem by checking whether any part of the object-to-evict or
its head touches the scrub range; if so, back off.  Do not let eviction
preempt scrub; we can come back and do it later.

Fixes: http://tracker.ceph.com/issues/23646
Signed-off-by: Sage Weil <sage@redhat.com>
(cherry picked from commit c20a95b0b9f4082dcebb339135683b91fe39ec0a)
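
The check described in this commit message can be sketched as below. This is a simplified Python model, not the OSD code: objects are represented as `(name, snap)` tuples, `HEAD` is a hypothetical sentinel that sorts after every clone, and the function names and signatures are assumptions chosen for illustration.

```python
HEAD = 2**63  # hypothetical sentinel: the head sorts after all clones


def range_intersects_scrub(first, last, scrub_start, scrub_end):
    """True if the inclusive range [first, last] overlaps [scrub_start, scrub_end)."""
    return first < scrub_end and last >= scrub_start


def may_evict(obj, head, scrub_start, scrub_end):
    """Allow eviction only if neither the clone nor its head touches the scrub range."""
    lo, hi = min(obj, head), max(obj, head)
    return not range_intersects_scrub(lo, hi, scrub_start, scrub_end)


# Scenario from the message: scrub is working on [foo:200, later:head).
scrub_start, scrub_end = ("foo", 200), ("later", HEAD)
clone, head = ("foo", 100), ("foo", HEAD)

# Checking the clone alone misses the conflict...
clone_only = range_intersects_scrub(clone, clone, scrub_start, scrub_end)
# ...while checking clone and head together defers the eviction.
allowed = may_evict(clone, head, scrub_start, scrub_end)
```

The clone `foo:100` lies before the scrub range, but evicting it updates the `:head` SnapSet, which lies inside the range. Checking both is what makes the eviction back off.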

7 years agoosd: If ending on a head object get all of meta map
David Zafman [Sat, 28 Apr 2018 22:44:06 +0000 (15:44 -0700)]
osd: If ending on a head object get all of meta map

When ending on a head object, the head and snapshots would stay in
cleaned_meta_map until more maps arrive.  The problem was that
during a scrub an eviction could occur because scrubber.start
is already past the stray object(s), so range_intersects_scrub() is false.

Fixes: http://tracker.ceph.com/issues/23909
Signed-off-by: David Zafman <dzafman@redhat.com>
(cherry picked from commit 83861a5b75ddb98366f1ec106487b88703f25cf7)

7 years agotest: Add test cases for multiple copy pool and snapshot errors
David Zafman [Wed, 25 Apr 2018 22:19:57 +0000 (15:19 -0700)]
test: Add test cases for multiple copy pool and snapshot errors

Signed-off-by: David Zafman <dzafman@redhat.com>
(cherry picked from commit 1a7fa9a62a62a35c645757287917101925044df1)

7 years agotest: Fix comment at end of scrub test scripts
David Zafman [Wed, 25 Apr 2018 22:15:50 +0000 (15:15 -0700)]
test: Fix comment at end of scrub test scripts

Signed-off-by: David Zafman <dzafman@redhat.com>
(cherry picked from commit bae4940574fa0ee267e40785c88ee6baa3fba96b)

7 years agotest: Prepare for second test and minor improvements
David Zafman [Fri, 20 Apr 2018 22:56:36 +0000 (15:56 -0700)]
test: Prepare for second test and minor improvements

Check list-inconsistent-obj output
Check how many _scan_snap groupings
Use more general check for crashed osd(s)

Signed-off-by: David Zafman <dzafman@redhat.com>
(cherry picked from commit 2fa596dc0c515b757bce3bd3089a2ed32304d976)

7 years agoosd: process _scan_snaps() with all snapshots with head
David Zafman [Fri, 20 Apr 2018 19:19:56 +0000 (12:19 -0700)]
osd: process _scan_snaps() with all snapshots with head

Fixes: http://tracker.ceph.com/issues/22881
Signed-off-by: David Zafman <dzafman@redhat.com>
(cherry picked from commit 8f0514bf59bad486df63d078b57df636eb969bc5)

Conflicts:
src/osd/PG.cc (trivial)

7 years agoosd/PG: kill extra scrubber state transition
xie xingguo [Fri, 23 Feb 2018 05:49:43 +0000 (13:49 +0800)]
osd/PG: kill extra scrubber state transition

Signed-off-by: xie xingguo <xie.xingguo@zte.com.cn>
(cherry picked from commit 323dca0c82b710766ece06da8efe8d99cf3c07ab)

7 years agoosd/PG: decay scrub_chunk_max too if scrub is preempted
xie xingguo [Fri, 23 Feb 2018 03:39:13 +0000 (11:39 +0800)]
osd/PG: decay scrub_chunk_max too if scrub is preempted

In the normal case we scrub at least as many objects as
osd_scrub_chunk_max specifies at a time, so the current
backoff mechanism has very limited effect.
Decaying both osd_scrub_chunk_min and osd_scrub_chunk_max
should be a better solution.

Signed-off-by: xie xingguo <xie.xingguo@zte.com.cn>
(cherry picked from commit a9260524676ac28742e5a945de93b87ae985017e)
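
As a rough illustration of decaying both bounds together, consider the sketch below. The halving-per-preemption rule, the function name, and the example values are assumptions made for this sketch; the commit message only states that both osd_scrub_chunk_min and osd_scrub_chunk_max should decay.

```python
def decayed_chunk_bounds(chunk_min, chunk_max, preemptions):
    """Shrink both chunk bounds by a divisor that doubles per preemption.

    Assumption for this sketch: each preemption halves the bounds, and
    neither bound is allowed to drop below one object.
    """
    divisor = 2 ** preemptions
    new_min = max(1, chunk_min // divisor)
    new_max = max(new_min, chunk_max // divisor)
    return new_min, new_max


# Example values (hypothetical configuration): min=5, max=25.
no_preemption = decayed_chunk_bounds(5, 25, 0)
one_preemption = decayed_chunk_bounds(5, 25, 1)
three_preemptions = decayed_chunk_bounds(5, 25, 3)
```

If only the minimum were decayed, a scrub that normally fills chunks up to the maximum would keep using large chunks after a preemption, so the backoff would have little effect.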

7 years agoosd/ReplicatedBackend: turn more be_deep_scrub options into legacy
xie xingguo [Thu, 22 Feb 2018 09:27:14 +0000 (17:27 +0800)]
osd/ReplicatedBackend: turn more be_deep_scrub options into legacy

Signed-off-by: xie xingguo <xie.xingguo@zte.com.cn>
(cherry picked from commit 946b6dde76e513af3e28a8725c873c414f4ad40b)

7 years agoosd/ReplicatedBackend: turn be_deep_scrub options into legacy
xie xingguo [Thu, 22 Feb 2018 08:53:49 +0000 (16:53 +0800)]
osd/ReplicatedBackend: turn be_deep_scrub options into legacy

See 588f0643f12ac842ff68cacd4d10d57f9f3ed3fe.

Signed-off-by: xie xingguo <xie.xingguo@zte.com.cn>
(cherry picked from commit 048e638e3335c32e767f0767c8aa64eedfb675db)

7 years agoosd/ECBackend: inject sleep during deep scrub
xie xingguo [Thu, 22 Feb 2018 08:16:37 +0000 (16:16 +0800)]
osd/ECBackend: inject sleep during deep scrub

Signed-off-by: xie xingguo <xie.xingguo@zte.com.cn>
(cherry picked from commit 3cec7bfd819deea609a2996b2a9d118968fa6128)

7 years agoosd/PG: pass scrub priority to replica
Sage Weil [Mon, 5 Feb 2018 13:10:54 +0000 (07:10 -0600)]
osd/PG: pass scrub priority to replica

If we are scrubbing with high priority on the primary, pass that along
to the replica so that it can schedule its scrub work accordingly.

Signed-off-by: Sage Weil <sage@redhat.com>
(cherry picked from commit d9fd07696058cf79a62c327ecf08a5f8fb5b6a28)

- fixed encode vs ::encode conflict

7 years agoosd/ReplicatedBackend: 'osd_deep_scrub_keys' doesn't work
fang yuxiang [Thu, 1 Feb 2018 06:17:17 +0000 (14:17 +0800)]
osd/ReplicatedBackend: 'osd_deep_scrub_keys' doesn't work

Signed-off-by: fang yuxiang <fang.yuxiang@eisoo.com>
(cherry picked from commit ad6039bbab42137b748d2377fb402e31f4e0dcfe)

7 years agoosd/osd_types.h: default to no data/omap digest for new object
xie xingguo [Wed, 6 Sep 2017 02:25:02 +0000 (10:25 +0800)]
osd/osd_types.h: default to no data/omap digest for new object

Signed-off-by: xie xingguo <xie.xingguo@zte.com.cn>
(cherry picked from commit 5345afa5d5c984a173fcf6d5a4447b71eb864070)

7 years agoosd/PG: drop 'seed' property from Scrubber
Sage Weil [Fri, 19 Jan 2018 19:59:56 +0000 (13:59 -0600)]
osd/PG: drop 'seed' property from Scrubber

This has been -1 for many releases now.

Signed-off-by: Sage Weil <sage@redhat.com>
(cherry picked from commit 2d34e380c8df465bfb3968fd10550285faa4a9b9)

Conflicts:
src/messages/MOSDRepScrub.h

- encode vs ::encode etc

7 years agoqa/suites/rados/singleton/all/divergent_priors*: unsquelch osd debug
Sage Weil [Wed, 3 Jan 2018 20:29:55 +0000 (14:29 -0600)]
qa/suites/rados/singleton/all/divergent_priors*: unsquelch osd debug

Signed-off-by: Sage Weil <sage@redhat.com>
(cherry picked from commit 5ac3bfa34c40d2beb79ad189e6a98033b981e75c)

7 years agoosd/ECBackend: debug ec scrub error paths
Sage Weil [Wed, 3 Jan 2018 20:19:35 +0000 (14:19 -0600)]
osd/ECBackend: debug ec scrub error paths

Signed-off-by: Sage Weil <sage@redhat.com>
(cherry picked from commit a188cb27dd0d458362181205915bf17df61595e6)

7 years agoosd: document scrub options
Sage Weil [Thu, 28 Dec 2017 23:27:43 +0000 (17:27 -0600)]
osd: document scrub options

Signed-off-by: Sage Weil <sage@redhat.com>
(cherry picked from commit 4e0f4238b92ac212f941592745978620a1967cd2)

7 years agoosd: allow limited scrub preemption
Sage Weil [Fri, 19 Jan 2018 17:29:19 +0000 (11:29 -0600)]
osd: allow limited scrub preemption

If we receive a write within the scrub range, abort the scrub chunk and
shrink the chunk size.  If we do this too many times, stop preempting and
allow the scrub to complete (to avoid scrub starvation due to client io).

Signed-off-by: Sage Weil <sage@redhat.com>
(cherry picked from commit 6dd42392c0f00011059ffa5de74cace7d1e911bd)

Conflicts:
src/messages/MOSDRepScrub.h
src/messages/MOSDRepScrubMap.h
src/osd/PrimaryLogPG.cc
src/osd/PrimaryLogPG.h

- encode vs ::encode etc
- dragged in waiting for scrub events from 508ea640e3b
- ignore change in chunked manifest code (which dne in luminous)
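
Illustration (not part of the commit): a minimal sketch of the preemption
policy described above. The class, its method, the halving factor, and the
default values are all hypothetical and are not taken from the Ceph source.
They only show the idea of a bounded preemption budget.

```python
class PreemptibleScrub:
    """Toy model of limited scrub preemption (all names are hypothetical)."""

    def __init__(self, chunk_size=25, min_chunk=1, max_preemptions=5):
        self.chunk_size = chunk_size
        self.min_chunk = min_chunk
        self.preemptions_left = max_preemptions

    def on_write_in_range(self):
        """Return True if the in-flight scrub chunk should be aborted."""
        if self.preemptions_left == 0:
            # Budget exhausted: the scrub is allowed to finish so that
            # client io cannot starve it.
            return False
        self.preemptions_left -= 1
        # Assumption: shrink by halving; the real shrink rule may differ.
        self.chunk_size = max(self.min_chunk, self.chunk_size // 2)
        return True


scrub = PreemptibleScrub(chunk_size=8, max_preemptions=2)
results = [scrub.on_write_in_range() for _ in range(3)]
```

With a budget of two preemptions, the first two writes abort the chunk and
shrink it, and the third write no longer preempts.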

7 years agoosd: piecewise scrub
Sage Weil [Fri, 19 Jan 2018 17:20:06 +0000 (11:20 -0600)]
osd: piecewise scrub

Perform scrub in stages, with each unit of work requeuing an item in the
work queue.

Signed-off-by: Sage Weil <sage@redhat.com>
(cherry picked from commit bf16f59887d6b7624112212cecead3ebec48b6f9)

Conflicts:
src/osd/PG.cc
src/osd/ReplicatedBackend.cc

- encode -> ::encode
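
Illustration (not part of the commit): a rough sketch of staged work that
requeues itself after each unit. The queue, the function names, and the
stride are assumptions made for the example and do not reflect the OSD's
actual work queue.

```python
from collections import deque


def scrub_stages(objects, stride):
    """Yield the scrub range one stage (slice of objects) at a time."""
    for start in range(0, len(objects), stride):
        yield objects[start:start + stride]


def run(queue):
    """Process one unit per item, then requeue the item at the tail."""
    log = []
    while queue:
        name, steps = queue.popleft()
        step = next(steps, None)
        if step is None:
            continue  # item finished, do not requeue it
        log.append((name, step))
        queue.append((name, steps))
    return log


queue = deque([
    ("scrub", scrub_stages(list("abcdef"), 2)),
    ("client-io", iter(["w1", "w2"])),
])
log = run(queue)
```

Because the scrub item goes back to the tail of the queue after every
stage, other queued work gets a turn between stages instead of waiting for
the whole scrub.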

7 years agoosd: flush before collection_list()
Sage Weil [Mon, 16 Oct 2017 15:47:39 +0000 (10:47 -0500)]
osd: flush before collection_list()

We would get this implicitly with FileStore if we waited for the onreadable
callbacks, but in some cases the OSD has already done that.  With BlueStore,
we need to explicitly flush().

Signed-off-by: Sage Weil <sage@redhat.com>
(cherry picked from commit b877860e4246cdd21b5ee79f17756efcf71b311e)

7 years agoosd/ECBackend: turn be_deep_scrub options into legacy
Sage Weil [Tue, 12 Dec 2017 16:50:16 +0000 (10:50 -0600)]
osd/ECBackend: turn be_deep_scrub options into legacy

We don't have a lightweight mechanism for doing trivial config options
that is better than legacy_config_opts.h yet.  Until then,

Signed-off-by: Sage Weil <sage@redhat.com>
(cherry picked from commit 588f0643f12ac842ff68cacd4d10d57f9f3ed3fe)

7 years agoqa/tasks/ceph: disable osd_debug_deep_scrub_sleep in case it is set
Sage Weil [Fri, 17 Nov 2017 16:20:40 +0000 (10:20 -0600)]
qa/tasks/ceph: disable osd_debug_deep_scrub_sleep in case it is set

Otherwise the final scrub may take too long.

Signed-off-by: Sage Weil <sage@redhat.com>
(cherry picked from commit 3f922e79c3c39710f5fbabc0dacef5f4ab19885b)

7 years agoosd/*Backend: debug: inject sleep during deep scrub
Sage Weil [Thu, 16 Nov 2017 14:58:01 +0000 (08:58 -0600)]
osd/*Backend: debug: inject sleep during deep scrub

Signed-off-by: Sage Weil <sage@redhat.com>
(cherry picked from commit 3e66d88f308af2a9bd3410f7476af342acf48b91)

7 years agoosd/PG: drop waiting_on, use waiting_on_whom
Sage Weil [Thu, 16 Nov 2017 14:57:13 +0000 (08:57 -0600)]
osd/PG: drop waiting_on, use waiting_on_whom

Signed-off-by: Sage Weil <sage@redhat.com>
(cherry picked from commit 004ee202fac7a2f3fba2b018426474eeae7f913b)

Add changes to PG::sub_op_scrub_map() which exists in Luminous

7 years agoosd/PrimaryLogPG: do not generate data digest for BlueStore by default
xie xingguo [Tue, 5 Sep 2017 12:56:32 +0000 (20:56 +0800)]
osd/PrimaryLogPG: do not generate data digest for BlueStore by default

BlueStore enables CRC by default, so this is a duplicate and provides
no additional benefit.

Turn this off by default, which is good for performance.

Signed-off-by: xie xingguo <xie.xingguo@zte.com.cn>
(cherry picked from commit afcb617dc9791aa6551a1856c89b3e8e2648eabc)

Conflicts:
../qa/standalone/scrub/osd-scrub-repair.sh (Modify json object info instead of string)

7 years agoosd/PrimaryLogPG: add condition "is_chunky_scrub_active" to check object in chunky_scrub.
Jianpeng Ma [Tue, 24 Oct 2017 14:07:18 +0000 (22:07 +0800)]
osd/PrimaryLogPG: add condition "is_chunky_scrub_active" to check object in chunky_scrub.

Avoid calling scrubber.write_block_by_scrub every time. Most of the time
the scrubber is inactive, and compared to write_block_by_scrub,
is_chunky_scrub_active is cheap.

Signed-off-by: Jianpeng Ma <jianpeng.ma@intel.com>
(cherry picked from commit 6c81c9bb0979c101c112e8ccd45880e08bfdb945)
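
Illustration (not part of the commit): a small sketch of the guard pattern
the message describes, with a cheap flag check placed in front of a costlier
range check. The Python class and helper are hypothetical stand-ins that
only borrow the names used in the message.

```python
class Scrubber:
    """Toy scrubber; fields and the call counter are hypothetical."""

    def __init__(self):
        self.active = False
        self.start = None
        self.end = None
        self.range_checks = 0  # counts how often the costly check runs

    def is_chunky_scrub_active(self):
        return self.active

    def write_block_by_scrub(self, oid):
        self.range_checks += 1
        return self.start <= oid < self.end


def write_must_wait(scrubber, oid):
    # Short-circuit: the range check only runs while a scrub is active.
    return scrubber.is_chunky_scrub_active() and scrubber.write_block_by_scrub(oid)


s = Scrubber()
idle = write_must_wait(s, 5)
checks_while_idle = s.range_checks
s.active, s.start, s.end = True, 0, 10
busy = write_must_wait(s, 5)
```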

7 years agoosd: add scrub week day constraint
kungf [Tue, 17 Oct 2017 14:40:43 +0000 (22:40 +0800)]
osd: add scrub week day constraint

With a week day constraint, the permitted scrub time can be set more
flexibly. E.g. we can allow scrubbing Monday-Wednesday, 0-12 o'clock,
by setting these parameters:
osd_scrub_begin_week_day = 1
osd_scrub_end_week_day = 3
osd_scrub_begin_hour = 0
osd_scrub_end_hour = 12

Signed-off-by: kungf <yang.wang@easystack.cn>
(cherry picked from commit 87be7c70a17492c9e5f06e01722690acec7a2c51)
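
Illustration (not part of the commit): a sketch of how such a day and hour
window could be evaluated. It assumes days are numbered with 0 = Sunday, that
both windows are half-open (begin included, end excluded), and that a window
with begin >= end wraps around. These are assumptions for the example, not
statements about the OSD code. Under the half-open reading, the settings
above stop permitting scrubs at the start of Wednesday.

```python
from datetime import datetime


def in_window(value, begin, end):
    """Half-open window [begin, end); wraps around when begin >= end."""
    if begin < end:
        return begin <= value < end
    return value >= begin or value < end


def scrub_time_permit(now, begin_week_day=1, end_week_day=3,
                      begin_hour=0, end_hour=12):
    # datetime.weekday() uses Monday=0 .. Sunday=6; convert to 0=Sunday.
    wday = (now.weekday() + 1) % 7
    return (in_window(wday, begin_week_day, end_week_day)
            and in_window(now.hour, begin_hour, end_hour))


monday_morning = scrub_time_permit(datetime(2017, 10, 16, 9, 0))
monday_afternoon = scrub_time_permit(datetime(2017, 10, 16, 13, 0))
sunday_morning = scrub_time_permit(datetime(2017, 10, 15, 9, 0))
```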