git.apps.os.sepia.ceph.com Git - ceph.git/log
Rishabh Dave [Tue, 3 Sep 2019 13:06:23 +0000 (18:36 +0530)]
api/lvm: rewrite a condition
Create the list of logical volumes if the list passed in arguments is
empty and rewrite the condition to make it more readable.
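A minimal sketch of the kind of condition described here. The names `get_lvs` and `resolve_lvs` are hypothetical stand-ins chosen for illustration, not the actual ceph-volume API.

```python
def get_lvs():
    """Hypothetical stand-in for querying the system for logical volumes."""
    return [{"lv_name": "osd-data"}, {"lv_name": "osd-journal"}]


def resolve_lvs(lvs=None):
    # Build the list only when the caller passed nothing (None or empty).
    if not lvs:
        lvs = get_lvs()
    return lvs
```

A caller that already has a list gets it back unchanged; an empty or missing list triggers the lookup.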
Fixes: https://tracker.ceph.com/issues/41649
Signed-off-by: Rishabh Dave <ridave@redhat.com>
(cherry picked from commit
d1f1bfd3635501090f4069be59e0bcde94dd64ec )
Yuri Weinstein [Tue, 29 Oct 2019 16:35:10 +0000 (09:35 -0700)]
Merge pull request #30225 from dzafman/wip-network-mimic
mimic: core: Health warnings on long network ping times
Reviewed-by: Neha Ojha <nojha@redhat.com>
xie xingguo [Wed, 26 Jun 2019 06:24:08 +0000 (14:24 +0800)]
osd/OSD: auto mark heartbeat sessions as stale and tear them down
The primary benefit is that the OSD doesn't need to keep a flood of
blocked heartbeat messages around in memory.
This prevents OSDs from accumulating heartbeat messages due to a
broken switch and then exhausting the whole node's memory:
Jun 11 04:19:26 host-192-168-9-12 kernel: [409881.137077] Out of memory: Kill process 1471476 (ceph-osd) score 47 or sacrifice child
Jun 11 04:19:26 host-192-168-9-12 kernel: [409881.146054] Killed process 1471476 (ceph-osd) total-vm:4822548kB, anon-rss:3097860kB, file-rss:2556kB, shmem-rss:0kB
Fixes: http://tracker.ceph.com/issues/40586
Signed-off-by: xie xingguo <xie.xingguo@zte.com.cn>
(cherry picked from commit
6cc90f363b8096d2d5fad30e57426d0cea9e3478 )
Conflicts:
src/osd/OSD.cc (no boot_finisher.stop() and no lock_guard)
src/osd/OSD.h (trivial)
Fixed get_val() call in reset_heartbeat_peers()
Yuri Weinstein [Wed, 23 Oct 2019 15:32:06 +0000 (08:32 -0700)]
Merge pull request #28585 from ukernel/mimic-40327
mimic: mds: change how mds revoke stale caps
Reviewed-by: Venky Shankar <vshankar@redhat.com>
Yuri Weinstein [Wed, 23 Oct 2019 15:31:21 +0000 (08:31 -0700)]
Merge pull request #30841 from smithfarm/wip-42263-mimic
mimic: tests: do not take ceph.conf.template from ceph/teuthology.git
Reviewed-by: Neha Ojha <nojha@redhat.com>
Reviewed-by: Kefu Chai <kchai@redhat.com>
Yuri Weinstein [Wed, 23 Oct 2019 15:28:47 +0000 (08:28 -0700)]
Merge pull request #30918 from smithfarm/wip-42122-mimic
mimic: cephfs: client: add procession of SEEK_HOLE and SEEK_DATA in lseek.
Reviewed-by: Venky Shankar <vshankar@redhat.com>
Yuri Weinstein [Wed, 23 Oct 2019 15:28:14 +0000 (08:28 -0700)]
Merge pull request #30979 from smithfarm/wip-41464-mimic
mimic: tools: ceph-objectstore-tool: update-mon-db: do not fail if incmap is missing
Reviewed-by: Kefu Chai <kchai@redhat.com>
Reviewed-by: Neha Ojha <nojha@redhat.com>
Yuri Weinstein [Wed, 23 Oct 2019 15:27:29 +0000 (08:27 -0700)]
Merge pull request #31017 from smithfarm/wip-40896-mimic-revert
mimic: cephfs: Revert "ceph_volume_client: convert string to bytes object"
Reviewed-by: Venky Shankar <vshankar@redhat.com>
Yuri Weinstein [Wed, 23 Oct 2019 15:25:53 +0000 (08:25 -0700)]
Merge pull request #29219 from smithfarm/wip-38875-mimic
mimic: mds: high debug logging with many subtrees is slow
Reviewed-by: Venky Shankar <vshankar@redhat.com>
Yuri Weinstein [Wed, 23 Oct 2019 15:25:24 +0000 (08:25 -0700)]
Merge pull request #30932 from smithfarm/wip-42034-mimic
mimic: cephfs: client: EINVAL may be returned when offset is 0.
Reviewed-by: Venky Shankar <vshankar@redhat.com>
Yuri Weinstein [Wed, 23 Oct 2019 15:24:59 +0000 (08:24 -0700)]
Merge pull request #30933 from smithfarm/wip-42038-mimic
mimic: cephfs: client: _readdir_cache_cb() may use the readdir_cache already clear
Reviewed-by: Venky Shankar <vshankar@redhat.com>
Nathan Cutler [Wed, 23 Oct 2019 15:10:09 +0000 (17:10 +0200)]
Merge pull request #31090 from smithfarm/wip-42416-mimic
mimic: doc/rbd: s/guess/xml/ for codeblock lexer
Reviewed-by: Jason Dillaman <dillaman@redhat.com>
Kefu Chai [Wed, 16 Oct 2019 04:34:19 +0000 (12:34 +0800)]
doc/rbd: s/guess/xml/ for codeblock lexer
this change silences the warning of
```
doc/rbd/qemu-rbd.rst:174: WARNING: Pygments lexer name 'guess' is not known
```
see http://pygments.org/docs/lexers/; we should use "xml" for XML.
Signed-off-by: Kefu Chai <kchai@redhat.com>
(cherry picked from commit
df226da996e468d2707b08eb012d54b4e37ffdc6 )
Yuri Weinstein [Tue, 22 Oct 2019 18:41:52 +0000 (11:41 -0700)]
Merge pull request #30775 from smithfarm/wip-41979-mimic
mimic: rgw: fix list versions starts with version_id=null
Reviewed-by: Casey Bodley <cbodley@redhat.com>
Yuri Weinstein [Tue, 22 Oct 2019 18:41:26 +0000 (11:41 -0700)]
Merge pull request #30868 from smithfarm/wip-41324-mimic
mimic: rgw: datalog/mdlog trim commands loop until done
Reviewed-by: Casey Bodley <cbodley@redhat.com>
Yuri Weinstein [Tue, 22 Oct 2019 18:40:59 +0000 (11:40 -0700)]
Merge pull request #30980 from smithfarm/wip-41496-mimic
mimic: rgw: fix the bug of rgw not doing necessary checking to website configuration
Reviewed-by: Casey Bodley <cbodley@redhat.com>
Yuri Weinstein [Tue, 22 Oct 2019 15:05:57 +0000 (08:05 -0700)]
Merge pull request #30891 from smithfarm/wip-41715-mimic
mimic: rgw: fix refcount tags to match and update object's idtag
Reviewed-by: J. Eric Ivancich <ivancich@redhat.com>
Yuri Weinstein [Tue, 22 Oct 2019 15:05:09 +0000 (08:05 -0700)]
Merge pull request #30977 from theanalyst/wip-41570-mimic
mimic: rgw: asio: check the remote endpoint before processing requests
Reviewed-by: Casey Bodley <cbodley@redhat.com>
Reviewed-by: Yuri Weinstein <yweinste@redhat.com>
Sage Weil [Fri, 10 May 2019 19:45:22 +0000 (14:45 -0500)]
qa/tasks/ceph.conf: do not warn on TOO_FEW_OSDS
Signed-off-by: Sage Weil <sage@redhat.com>
(cherry picked from commit
0483c1c3e7ffdfa6a6f65c5ef000c45d2f096428 )
Yuri Weinstein [Mon, 21 Oct 2019 23:53:42 +0000 (16:53 -0700)]
Merge pull request #30713 from smithfarm/wip-40258-mimic
mimic: cmake: detect armv8 crc and crypto feature using CHECK_C_COMPILER_FLAG
Reviewed-by: Kefu Chai <kchai@redhat.com>
Reviewed-by: Josh Durgin <jdurgin@redhat.com>
Yuri Weinstein [Mon, 21 Oct 2019 23:52:41 +0000 (16:52 -0700)]
Merge pull request #30893 from smithfarm/wip-41964-mimic
mimic: tools/rados: list objects in a pg
Reviewed-by: Vikhyat Umrao <vikhyat@redhat.com>
Reviewed-by: Neha Ojha <nojha@redhat.com>
Yuri Weinstein [Mon, 21 Oct 2019 23:51:55 +0000 (16:51 -0700)]
Merge pull request #30898 from smithfarm/wip-42128-mimic
mimic: osd/OSDMap: do not trust partially simplified pg_upmap_item
Reviewed-by: xie xingguo <xie.xingguo@zte.com.cn>
Yuri Weinstein [Mon, 21 Oct 2019 23:51:30 +0000 (16:51 -0700)]
Merge pull request #30903 from smithfarm/wip-42154-mimic
mimic: mon/OSDMonitor: trim not-longer-exist failure reporters
Reviewed-by: Neha Ojha <nojha@redhat.com>
Reviewed-by: Kefu Chai <kchai@redhat.com>
Yuri Weinstein [Mon, 21 Oct 2019 23:50:40 +0000 (16:50 -0700)]
Merge pull request #30924 from vumrao/wip-vumrao-42240
mimic: osd/PG: Add PG to large omap log message
Reviewed-by: Brad Hubbard <bhubbard@redhat.com>
Reviewed-by: Neha Ojha <nojha@redhat.com>
Yuri Weinstein [Mon, 21 Oct 2019 23:48:02 +0000 (16:48 -0700)]
Merge pull request #30846 from wido/mimic-42116
mimic: mgr/telemetry: Ignore crashes in report when module not enabled
Reviewed-by: Sage Weil <sage@redhat.com>
Yuri Weinstein [Mon, 21 Oct 2019 23:47:22 +0000 (16:47 -0700)]
Merge pull request #30895 from smithfarm/wip-42036-mimic
mimic: osd/PeeringState: recover_got - add special handler for empty log
Reviewed-by: xie xingguo <xie.xingguo@zte.com.cn>
Reviewed-by: Neha Ojha <nojha@redhat.com>
Yuri Weinstein [Mon, 21 Oct 2019 23:46:52 +0000 (16:46 -0700)]
Merge pull request #30901 from smithfarm/wip-42137-mimic
mimic: osd: Remove unused osdmap flags full, nearfull from output
Reviewed-by: David Zafman <dzafman@redhat.com>
Reviewed-by: Kefu Chai <kchai@redhat.com>
Reviewed-by: xie xingguo <xie.xingguo@zte.com.cn>
Yuri Weinstein [Mon, 21 Oct 2019 23:45:59 +0000 (16:45 -0700)]
Merge pull request #30916 from smithfarm/wip-41457-mimic
mimic: osd: merge replica log on primary need according to replica log's crt
Reviewed-by: Neha Ojha <nojha@redhat.com>
Yuri Weinstein [Mon, 21 Oct 2019 23:45:30 +0000 (16:45 -0700)]
Merge pull request #30982 from tchaikov/wip-mimic-42362
mimic: build/ops: python3-cephfs should provide python36-cephfs
Reviewed-by: Nathan Cutler <ncutler@suse.com>
Yuri Weinstein [Mon, 21 Oct 2019 23:44:32 +0000 (16:44 -0700)]
Merge pull request #30991 from smithfarm/wip-37520-mimic-revert
mimic: msg: Revert "msg/async: do not trigger RESETSESSION from connect fault during connection phase"
Reviewed-by: Ricardo Dias <rdias@suse.com>
Reviewed-by: xie xingguo <xie.xingguo@zte.com.cn>
Nathan Cutler [Mon, 21 Oct 2019 12:02:08 +0000 (14:02 +0200)]
Revert "ceph_volume_client: make UTF-8 encoding explicit"
This reverts commit
ddb8cfa072cc19fb6cb61128b4fe4b8ffe1a1742 .
Signed-off-by: Nathan Cutler <ncutler@suse.com>
Nathan Cutler [Mon, 21 Oct 2019 12:02:03 +0000 (14:02 +0200)]
Revert "pybind/ceph_volume_client: allow volume_client to"
This reverts commit
1ab18fcf056ef8e4fcb9ed27e64ee4f8a75866dc .
Signed-off-by: Nathan Cutler <ncutler@suse.com>
Nathan Cutler [Mon, 21 Oct 2019 12:01:55 +0000 (14:01 +0200)]
Revert "ceph_volume_client: convert string to bytes object"
This reverts commit
e081e97190b58d3a09377275f6d484bf9020fe51 .
Signed-off-by: Nathan Cutler <ncutler@suse.com>
Nathan Cutler [Mon, 21 Oct 2019 12:01:48 +0000 (14:01 +0200)]
Revert "ceph_volume_client: don't convert None to str object"
This reverts commit
e085982def42738d181240c7b6e97b30f6bc0cf8 .
Signed-off-by: Nathan Cutler <ncutler@suse.com>
Yan, Zheng [Wed, 19 Jun 2019 06:39:55 +0000 (14:39 +0800)]
mds: check last laggy before marking unresponsive client stale
The MDS may now evict an unresponsive client without first marking its
session stale, so the last laggy check needs to be adjusted.
Signed-off-by: "Yan, Zheng" <zyan@redhat.com>
(cherry picked from commit
e5cc6f52feb64fa1c7ba1ee2b304cd6588f16e7a )
Conflicts:
src/mds/Server.cc
Yan, Zheng [Wed, 19 Jun 2019 03:42:05 +0000 (11:42 +0800)]
mds: remove the code that skip evicting the only client
There is already logic that defers marking an unresponsive client stale,
so there is no reason to defer evicting the only stale client.
Signed-off-by: "Yan, Zheng" <zyan@redhat.com>
(cherry picked from commit
cd29206974427a4f6ab410b1482bbd8ebfb55fbd )
Conflicts:
qa/tasks/cephfs/test_misc.py
Yan, Zheng [Tue, 5 Mar 2019 09:40:08 +0000 (17:40 +0800)]
qa/cephfs: update tests for stale session handling
Signed-off-by: "Yan, Zheng" <zyan@redhat.com>
(cherry picked from commit
1c8be588e32f47ca712561711ad1ffdddc54b330 )
Conflicts:
qa/tasks/cephfs/test_client_recovery.py
Yan, Zheng [Wed, 27 Feb 2019 12:51:38 +0000 (20:51 +0800)]
mds: change how mds revoke stale caps
- Only revoke conflicting caps from a stale client.
- If a stale client holds conflicting CEPH_CAP_ANY_WR caps,
blacklist and kill it.
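The two rules above can be sketched as a small decision function. The bit values and the function name `handle_stale_client` are assumptions made for illustration; the real CEPH_CAP_* constants and the MDS Locker code differ.

```python
# Illustrative bit values only; the real CEPH_CAP_* constants differ.
CAP_RD = 0x1
CAP_WR = 0x2
CAP_ANY_WR = CAP_WR  # stand-in for CEPH_CAP_ANY_WR


def handle_stale_client(issued, conflicting):
    """Return (caps_to_revoke, blacklist) for a stale client.

    issued: caps the stale client holds.
    conflicting: caps that another client's request conflicts with.
    """
    # Only the caps that actually conflict are revoked.
    revoke = issued & conflicting
    # Conflicting write caps cannot be revoked safely: blacklist instead.
    blacklist = bool(revoke & CAP_ANY_WR)
    return revoke, blacklist
```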
Fixes: https://tracker.ceph.com/issues/38326
Signed-off-by: "Yan, Zheng" <zyan@redhat.com>
(cherry picked from commit
dcae1ea2d30398f7b6493a74b482e964a21fcfeb )
Conflicts:
src/mds/CInode.cc
src/mds/Capability.cc
src/mds/Locker.cc
src/mds/MDSRank.h
src/mds/Server.cc
Yan, Zheng [Mon, 17 Jun 2019 04:58:58 +0000 (12:58 +0800)]
mds: don't mark unresponsive sessions holding no caps stale
When an unresponsive MDS session holds no caps, do not mark it stale
even after session_timeout; at session_autoclose, evict it directly.
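A rough sketch of the decision described above, assuming a hypothetical `session_action` helper that takes the idle time and both thresholds as plain numbers of seconds. It is a simplification for illustration, not the logic in src/mds/Server.cc.

```python
def session_action(idle_seconds, holds_caps, session_timeout, session_autoclose):
    """Decide what to do with an unresponsive session (illustrative only)."""
    # Past session_autoclose the session is evicted, with or without caps.
    if idle_seconds >= session_autoclose:
        return "evict"
    # Past session_timeout, only sessions that hold caps are marked stale.
    if idle_seconds >= session_timeout and holds_caps:
        return "mark_stale"
    return "none"
```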
Fixes: http://tracker.ceph.com/issues/17854
Signed-off-by: Rishabh Dave <ridave@redhat.com>
(cherry picked from commit
98af31d10f362c05ea8ed57495973b08599431e7 )
Conflicts:
src/mds/Server.cc
Yan, Zheng [Mon, 10 Dec 2018 03:37:32 +0000 (11:37 +0800)]
mds: optimize resuming stale caps
If the client doesn't want any caps, there is no need to re-issue stale
caps.
A special case is when the client wants some caps but skipped updating
'wanted'. In this case, the client needs to update 'wanted' when the stale
session gets renewed.
Fixes: http://tracker.ceph.com/issues/38043
Signed-off-by: "Yan, Zheng" <zyan@redhat.com>
(cherry picked from commit
e824b3d2024db36789fdf579f0af9bf3bbe55d51 )
Conflicts:
src/client/Client.cc
src/mds/cephfs_features.h
Yan, Zheng [Thu, 6 Dec 2018 09:22:25 +0000 (17:22 +0800)]
client: avoid unnecessary wakeup when handling RENEWCAPS
Signed-off-by: "Yan, Zheng" <zyan@redhat.com>
(cherry picked from commit
c744bc1673ca0b4e50f41516fdf49c8560db073a )
Yan, Zheng [Thu, 6 Dec 2018 09:16:49 +0000 (17:16 +0800)]
client: don't wakeup cap waiters twice when mds recovered
Both kick_maxsize_requests() and wake_inode_waiters() wake up cap
waiters.
Signed-off-by: "Yan, Zheng" <zyan@redhat.com>
(cherry picked from commit
5993a93e606d23ef311831863444294d7c64cc04 )
Conflicts:
src/client/Client.cc
src/client/Client.h
Yan, Zheng [Thu, 6 Dec 2018 07:30:44 +0000 (15:30 +0800)]
client: set cap->wanted when adding new cap
This avoids an unnecessary cap message if the cap is added by an
open/create request reply.
Signed-off-by: "Yan, Zheng" <zyan@redhat.com>
(cherry picked from commit
314660a46a8a1af97f70b2bac05b2f6fa5d23bc4 )
Conflicts:
src/client/Client.h
Yan, Zheng [Fri, 23 Nov 2018 08:19:52 +0000 (16:19 +0800)]
mds: optimize revoking stale caps
For caps that are not being revoked, don't have a writeable range,
and don't want exclusive caps or file read/write, there is no need
to call Locker::revoke_stale_caps(Capability*), because these caps
don't need recovery and don't affect eval_gather()/try_eval().
Signed-off-by: "Yan, Zheng" <zyan@redhat.com>
(cherry picked from commit
cb62030f0468fc04735c1b4cff73da779cb11ad8 )
Conflicts:
src/mds/CInode.cc
src/mds/Capability.h
src/mds/Locker.cc
src/mds/Migrator.cc
Yan, Zheng [Thu, 22 Nov 2018 09:28:15 +0000 (17:28 +0800)]
mds: put notable caps at the front of session's caps list
Notable capabilities are ones that are being revoked, ones that
have writeable ranges, and ones that want exclusive caps or
file read/write.
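The ordering rule can be sketched with a double-ended queue. The `Cap` class, its field names, and `add_cap` are hypothetical; they mirror the three conditions listed above rather than the real Capability class.

```python
from collections import deque
from dataclasses import dataclass


@dataclass
class Cap:
    name: str
    revoking: bool = False
    has_writeable_range: bool = False
    wants_exclusive_or_file_rw: bool = False

    def is_notable(self):
        # A cap is notable if any of the three conditions holds.
        return (self.revoking
                or self.has_writeable_range
                or self.wants_exclusive_or_file_rw)


def add_cap(caps, cap):
    # Notable caps go to the front of the session's list, others to the back.
    if cap.is_notable():
        caps.appendleft(cap)
    else:
        caps.append(cap)
```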
Signed-off-by: "Yan, Zheng" <zyan@redhat.com>
(cherry picked from commit
cb6e7184458e64720ad83c266e5e393a80c32697 )
Yan, Zheng [Wed, 21 Nov 2018 12:22:25 +0000 (20:22 +0800)]
mds: track if client has writeable range in Capability
Signed-off-by: "Yan, Zheng" <zyan@redhat.com>
(cherry picked from commit
370ae1cb3e5dc07867d80e998082bc514e8fccfd )
Conflicts:
src/mds/Locker.cc
src/mds/MDCache.h
src/mds/Server.cc
Yan, Zheng [Mon, 26 Nov 2018 01:44:56 +0000 (09:44 +0800)]
mds: add session pointer to Capability
Signed-off-by: "Yan, Zheng" <zyan@redhat.com>
(cherry picked from commit
7c628472a86c6acebb20f0a2504744b10f250587 )
Conflicts:
src/mds/CInode.cc
src/mds/Capability.h
src/mds/Locker.cc
Yan, Zheng [Thu, 22 Nov 2018 07:55:12 +0000 (15:55 +0800)]
client: sync 'retain caps' logical from kernel client
The main change is keeping CEPH_CAP_ANY_RD for unused file inodes
Signed-off-by: "Yan, Zheng" <zyan@redhat.com>
(cherry picked from commit
29034396398b1706db625e1abf2e6682be7130a5 )
Yan, Zheng [Thu, 22 Nov 2018 07:02:36 +0000 (15:02 +0800)]
client: skip updating 'wanted' caps if caps are already issued
When reading a cached inode that already has Fscr caps, this can avoid
two cap messages (one updates 'wanted' caps, one clears 'wanted' caps).
Signed-off-by: "Yan, Zheng" <zyan@redhat.com>
(cherry picked from commit
d20b260ecf3f323c87ad1e6865f87e2381444546 )
David Zafman [Fri, 26 Jul 2019 05:23:21 +0000 (22:23 -0700)]
osd mon mgr: Changes for rebase and correction for this branch
Fix use of asok_command() which doesn't do try/catch
Need unregister_command() since unregister_commands() doesn't exist here
Use Mutex::locker since lock_guard() isn't available
Use new g_conf which isn't g_conf() anymore
cct->_conf is a pointer now
Use ceph_abort() because cct isn't available for ceph_abort_msg()
Signed-off-by: David Zafman <dzafman@redhat.com>
David Zafman [Thu, 3 Oct 2019 16:09:10 +0000 (09:09 -0700)]
test: Ignore OSD_SLOW_PING_TIME* if injecting socket failures
Fixes: https://tracker.ceph.com/issues/41743
Signed-off-by: David Zafman <dzafman@redhat.com>
(cherry picked from commit
ded58ef91d6c8a68de49fa2c6b6e01636515c59b )
Conflicts: 3 yamls don't exist in Mimic
David Zafman [Fri, 6 Sep 2019 18:20:10 +0000 (11:20 -0700)]
test: Allow fractional milliseconds to make test possible
Fixes: https://tracker.ceph.com/issues/41689
Signed-off-by: David Zafman <dzafman@redhat.com>
(cherry picked from commit
6d2e4cb109caff8dae5e5e18563b6305131b488b )
David Zafman [Wed, 4 Sep 2019 18:38:09 +0000 (18:38 +0000)]
doc: Document network performance monitoring
Signed-off-by: David Zafman <dzafman@redhat.com>
(cherry picked from commit
71015b94abdf669695754a598a05a4a1c5d46f83 )
Conflicts:
doc/rados/operations/monitoring.rst (trivial)
David Zafman [Wed, 4 Sep 2019 17:13:32 +0000 (17:13 +0000)]
osd doc mon mgr: To milliseconds for config value, user input and threshold out
Signed-off-by: David Zafman <dzafman@redhat.com>
(cherry picked from commit
5f83a6158b29944cf8f5a069c50edba3e172cdcc )
Conflicts:
src/common/options.cc (trivial)
David Zafman [Tue, 6 Aug 2019 03:57:48 +0000 (20:57 -0700)]
osd mon mgr: Convert all network ping time output to milliseconds
To output milliseconds (usec / 1000), treat as fixed point integers
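A small sketch of the fixed-point idea: the microsecond count is split with integer arithmetic so no floating-point rounding is involved. The function name and the three-decimal output format are assumptions; the actual Ceph formatting may differ.

```python
def usec_to_ms_str(usec):
    """Render a non-negative microsecond count as milliseconds."""
    # Integer division keeps the value exact: whole ms and remaining usec.
    whole, frac = divmod(usec, 1000)
    return f"{whole}.{frac:03d}"
```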
Signed-off-by: David Zafman <dzafman@redhat.com>
(cherry picked from commit
9d02e5d39d7b5e2806a5d98bdde24f4584e70528 )
Conflicts:
src/mon/PGMap.cc (trivial)
David Zafman [Fri, 9 Aug 2019 01:06:43 +0000 (18:06 -0700)]
common: Add support routines to generate strings for fixed point
Signed-off-by: David Zafman <dzafman@redhat.com>
(cherry picked from commit
8ac1562b4988fc3d52f92f15eb58075de0bcf27e )
Conflicts:
src/common/Formatter.h (trivial)
David Zafman [Sat, 13 Jul 2019 02:35:04 +0000 (19:35 -0700)]
test: Add basic test for network ping tracking
Signed-off-by: David Zafman <dzafman@redhat.com>
(cherry picked from commit
4fb42ea27e7b6acefd081b7b287d38347a6085ce )
David Zafman [Wed, 24 Jul 2019 21:19:43 +0000 (14:19 -0700)]
osd: Add debug_heartbeat_testing_span to allow quicker testing
Signed-off-by: David Zafman <dzafman@redhat.com>
(cherry picked from commit
573aea2bb1d48237df5182a6e4421e15c1eea88c )
David Zafman [Wed, 24 Jul 2019 01:10:46 +0000 (18:10 -0700)]
osd: Add debug_disable_randomized_ping config for use in testing
Signed-off-by: David Zafman <dzafman@redhat.com>
(cherry picked from commit
f2b26d88f0a0727f0362ccd8b287f8bb3f41dc3c )
Conflicts:
src/osd/OSD.cc (trivial)
src/common/options.cc (trivial)
David Zafman [Mon, 22 Jul 2019 18:52:41 +0000 (11:52 -0700)]
osd mgr: Add osd_mon_heartbeat_stat_stale option to time out ping info after 1 hour
Signed-off-by: David Zafman <dzafman@redhat.com>
(cherry picked from commit
048f8096265dd3a647adb970255e4b11c9617b2e )
Conflicts:
src/osd/OSD.cc (trivial)
David Zafman [Fri, 19 Jul 2019 04:29:49 +0000 (21:29 -0700)]
mon: Indicate when an osd with slow ping time is down
Signed-off-by: David Zafman <dzafman@redhat.com>
(cherry picked from commit
5ab145d6402a2525d69296de95b36214bc4c7431 )
David Zafman [Fri, 19 Jul 2019 04:28:16 +0000 (21:28 -0700)]
osd mon: Add last_update to osd_stat_t heartbeat info
Ignore old heartbeat info which hasn't updated
Signed-off-by: David Zafman <dzafman@redhat.com>
(cherry picked from commit
ea20d3522aaf644cef989c565e11dd781e420e18 )
Conflicts:
src/osd/osd_types.h (osd_stat_t location in file changed)
David Zafman [Tue, 16 Jul 2019 19:02:43 +0000 (12:02 -0700)]
osd: After first interval populate vectors so 5min/15min values aren't 0
Signed-off-by: David Zafman <dzafman@redhat.com>
(cherry picked from commit
6555699d289769a44e9840424192a1be1a6ba00d )
David Zafman [Mon, 15 Jul 2019 20:23:53 +0000 (13:23 -0700)]
osd mgr: Store last pingtime for possible graphing
Signed-off-by: David Zafman <dzafman@redhat.com>
(cherry picked from commit
3f846d7c806b7f62ead08f0e9fb2ba927ffe0592 )
Conflicts:
src/osd/osd_types.h (osd_stat_t location in file changed)
David Zafman [Fri, 12 Jul 2019 01:06:23 +0000 (01:06 +0000)]
osd mgr: Add minimum and maximum tracking to network ping time
Signed-off-by: David Zafman <dzafman@redhat.com>
(cherry picked from commit
297a0e7b1de410c094fc9a6e42be14813d6dac5e )
Conflicts:
src/osd/osd_types.cc (trivial)
src/osd/osd_types.h (osd_stat_t location in file changed)
David Zafman [Thu, 11 Jul 2019 00:05:47 +0000 (00:05 +0000)]
doc: Add documentation and release notes
Signed-off-by: David Zafman <dzafman@redhat.com>
(cherry picked from commit
f4a0be2e8707f921d65bf22a6c1090e402905ad3 )
Conflicts:
PendingReleaseNotes (trivial)
David Zafman [Thu, 11 Jul 2019 21:24:12 +0000 (21:24 +0000)]
osd mgr mon: Add mon_warn_on_slow_ping_ratio config as 5% of osd_heartbeat_grace
Compute network ping threshold based on ratio (5% of 20 seconds is 1 second)
Make the threshold value used part of dump_osd_network for osd and mgr
Keep mon_warn_on_slow_ping_time (default 0) to optionally override the ratio
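The threshold selection can be sketched as follows. The function name is hypothetical, and both values are assumed to be in seconds, matching the arithmetic in the message above.

```python
def slow_ping_threshold(heartbeat_grace, slow_ping_ratio, slow_ping_time=0):
    """Return the warning threshold in seconds (illustrative only)."""
    # A non-zero mon_warn_on_slow_ping_time overrides the ratio.
    if slow_ping_time > 0:
        return slow_ping_time
    # Otherwise use a fraction of osd_heartbeat_grace.
    return heartbeat_grace * slow_ping_ratio
```

With a grace of 20 seconds and a ratio of 0.05, the threshold comes out at 1 second, as stated above.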
Signed-off-by: David Zafman <dzafman@redhat.com>
(cherry picked from commit
0d1bbd34e96e2da2027861229b376805d5ea8aa6 )
David Zafman [Tue, 9 Jul 2019 17:22:12 +0000 (17:22 +0000)]
mgr: Add "dump_osd_network" mgr admin request to get a sorted report
Signed-off-by: David Zafman <dzafman@redhat.com>
(cherry picked from commit
5d3c1856415f8b66e31361a0a7b9c75edc46e49e )
Conflicts:
src/mgr/ClusterState.cc (trivial)
src/mgr/ClusterState.h (trivial)
David Zafman [Wed, 10 Jul 2019 18:15:44 +0000 (18:15 +0000)]
osd: Add "dump_osd_network" osd admin request to get a sorted report
Signed-off-by: David Zafman <dzafman@redhat.com>
(cherry picked from commit
025b10a5329127734367a6899543f51cd8580d43 )
Conflicts:
src/osd/OSD.cc (trivial)
David Zafman [Wed, 26 Jun 2019 02:59:06 +0000 (02:59 +0000)]
osd mon: Track heartbeat ping times and report health warning
Fixes: http://tracker.ceph.com/issues/40640
Signed-off-by: David Zafman <dzafman@redhat.com>
(cherry picked from commit
66d44e7f911a57100d650ad7df9445f88ec70140 )
Conflicts:
src/common/options.cc (trivial)
src/mon/PGMap.cc (trivial)
src/osd/OSD.cc (trivial)
src/osd/OSD.h (trivial)
src/osd/osd_types.cc (encode version difference)
src/osd/osd_types.h (osd_stat_t location in file changed)
src/mon/PGMap.cc manually get rid of extra argument to checks->add
src/osd/OSD.cc rename ping_stamp to stamp for backport
xie xingguo [Mon, 8 Jan 2018 07:02:58 +0000 (15:02 +0800)]
osd/OSD: fix HeartbeatInfo.is_healthy() check
Delay declaring a peer healthy until we have received the first
replies from both the front and back connections.
Signed-off-by: xie xingguo <xie.xingguo@zte.com.cn>
(cherry picked from commit
d9123158d1fef329fb9bf5ff787f9c84bb51b44c )
xie xingguo [Mon, 8 Jan 2018 02:24:09 +0000 (10:24 +0800)]
osd/OSD: use first_tx to calculate failed_for
If we never hear any replies from a heartbeat peer, use first_tx
to calculate failed_for, which is more accurate.
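A minimal sketch of the fallback, assuming timestamps are plain numbers and that `last_rx` is None when no reply has ever arrived. The function signature is hypothetical.

```python
def failed_for(now, first_tx, last_rx=None):
    """How long the peer has been failing, in the same unit as the inputs."""
    # With no reply ever received, measure from the first ping we sent.
    start = last_rx if last_rx is not None else first_tx
    return now - start
```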
Signed-off-by: xie xingguo <xie.xingguo@zte.com.cn>
(cherry picked from commit
aba603736cbce94f7e1e5ac851ae4d4f43ea63e6 )
xie xingguo [Mon, 16 May 2016 05:50:28 +0000 (13:50 +0800)]
osd: refactor heartbeat health check
The original logic reuses the timestamp at which we sent pings to a
heartbeat peer to update the last_rx_front[back] field when the
corresponding replies arrive. That field is later treated as the exact
time the replies were received, and is used to calculate the heartbeat
latency and to determine whether the peer is dead.
However, this is not accurate enough, as there may be a delay between
receiving a reply and calling heartbeat_check(). We can eliminate
the delay by introducing a map that tracks the ping history,
each entry of which consists of three elements:
1. "tx_time", which serves as the map key, is the exact timestamp
at which we sent the pings.
2. "deadline" is the time by which we must receive all replies;
otherwise we consider this peer "dead".
3. "unacknowledged" is how many of the pings sent at tx_time
are still unacknowledged. The initial value is 2 (as we send
two pings, one from the front and one from the back side, to each peer).
We insert an item into the map every time we send out a ping, and
decrease the "unacknowledged" counter by 1 each time we get a reply to
the tracked ping. If "unacknowledged" drops to 0, we know all the replies
have been collected and we can safely erase the relevant
item from the map, along with any earlier ones.
By comparing the current timestamp with the oldest deadline, we can now
make a much more accurate decision about whether the corresponding peer is
healthy. And by setting last_rx_* to the timestamp at which we receive
the reply, the lower bound on when we stopped hearing replies from the
corresponding connection is also much clearer.
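The ping-history map described above can be sketched as follows. The class and method names are hypothetical, timestamps are plain numbers, and the deadline is assumed to be `tx_time + grace`; this illustrates the bookkeeping only and is not the code in src/osd/OSD.cc.

```python
class PingHistory:
    """Tracks outstanding pings to one heartbeat peer (illustrative only)."""

    def __init__(self, grace):
        self.grace = grace
        self.history = {}  # tx_time -> [deadline, unacknowledged]
        self.last_rx = {"front": None, "back": None}

    def sent(self, tx_time):
        # Two pings go out per round: one on the front side, one on the back.
        self.history[tx_time] = [tx_time + self.grace, 2]

    def reply(self, tx_time, side, now):
        # Record when the reply actually arrived, not when the ping was sent.
        self.last_rx[side] = now
        entry = self.history.get(tx_time)
        if entry is None:
            return
        entry[1] -= 1
        if entry[1] == 0:
            # Fully acknowledged: drop this entry and any earlier ones.
            for t in [t for t in self.history if t <= tx_time]:
                del self.history[t]

    def past_deadline(self, now):
        # The peer is suspect once the oldest outstanding deadline has passed.
        if not self.history:
            return False
        oldest = min(self.history)
        return now > self.history[oldest][0]
```

Once both replies for a later round arrive, earlier unanswered rounds are erased with it, so only the oldest still-outstanding deadline is ever compared against the current time.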
Signed-off-by: xie xingguo <xie.xingguo@zte.com.cn>
(cherry picked from commit
477774ceee42641f6d6884536462f92567bfea11 )
Conflicts:
src/osd/OSD.cc (send_still_alive() has 1 less argument)
Jan Fajerski [Fri, 18 Oct 2019 12:02:01 +0000 (14:02 +0200)]
Merge pull request #30808 from jan--f/wip-42233-mimic
mimic: ceph-volume: VolumeGroups.filter shouldn't purge itself
Nathan Cutler [Fri, 18 Oct 2019 11:59:33 +0000 (13:59 +0200)]
Merge pull request #30936 from smithfarm/wip-42130-mimic
mimic: doc/ceph-fuse: mention -k option in ceph-fuse man page
Reviewed-by: Rishabh Dave <ridave@redhat.com>
Jan Fajerski [Fri, 18 Oct 2019 11:58:51 +0000 (13:58 +0200)]
Merge pull request #30806 from jan--f/wip-42235-mimic
mimic: ceph-volume: PVolumes.filter shouldn't purge itself
Nathan Cutler [Fri, 18 Oct 2019 10:31:14 +0000 (12:31 +0200)]
Revert "msg/async: do not trigger RESETSESSION from connect fault during connection phase"
This reverts commit
00b163564c6cafd4edf54d470cc708eab9dae10e .
Xie Xingguo and Ricardo Dias looked at this, and both agreed that the bug
was caused by the reduction of states of the V1 protocol during its
refactoring. In other words, the bug is not present in mimic.
Signed-off-by: Nathan Cutler <ncutler@suse.com>
Kefu Chai [Thu, 10 Oct 2019 02:11:27 +0000 (10:11 +0800)]
ceph.spec.in: provide python2-<modname>
to be consistent with other python2 packages, and their python3
counterparts
the `python_provide` macro is offered by `python-rpm-macros` package,
which is in turn required by python*-devel
Signed-off-by: Kefu Chai <kchai@redhat.com>
(cherry picked from commit
fb6290b1fab0bd8a73043f2d68210c010e2fb425 )
Kefu Chai [Thu, 10 Oct 2019 01:54:50 +0000 (09:54 +0800)]
ceph.spec.in: use python_provide macro
our python3 bindings are named `python3-<modname>` now that python3 is
maintained by RHEL/CentOS instead of EPEL. to help the users using
`python36-<modname>`, we should "Provide" `python36-<modname>`.
the `python_provide` macro is offered by `python-rpm-macros` package,
which is in turn required by python*-devel. and we do install
`python36-devel` in install-deps.sh, and install `python3-devel` in
ceph-*build/build/setup_rpm
see also
https://docs.fedoraproject.org/en-US/packaging-guidelines/Python/#_provides
Signed-off-by: Kefu Chai <kchai@redhat.com>
(cherry picked from commit
50b19e673d8200306c3e36d1abaec414a3d336b9 )
Enming Zhang [Wed, 10 Jul 2019 07:48:57 +0000 (00:48 -0700)]
rgw: fix checking index_doc_suffix when getting effective key
Currently, if index_doc_suffix is empty, which happens when the
IndexDocument field is not configured or is set to an
empty string while enabling the bucket website function,
rgw will crash when the static website is accessed through the
S3Website-enabled RGW instance.
We already added the necessary check in commit
355f392ad26631f44dac250296e96f421d86fb8f , but double-checking
here is better.
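A sketch of the defensive check, under the assumption that the effective key is formed by appending the index document suffix to keys that are empty or end in "/". The function below is a hypothetical illustration, not the code in src/rgw/rgw_rest_s3.cc.

```python
def effective_key(key, index_doc_suffix):
    """Return the object key to serve for a website request (illustrative)."""
    # Guard: with no suffix configured there is nothing to append.
    if not index_doc_suffix:
        return key
    # Directory-style requests are mapped to the index document.
    if key == "" or key.endswith("/"):
        return key + index_doc_suffix
    return key
```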
Signed-off-by: Enming Zhang <enming.zhang@umcloud.com>
(cherry picked from commit
c96f415dafe176b1b8d10ff9456d13fb76c79baa )
Conflicts:
src/rgw/rgw_rest_s3.cc
- ldpp_dout
Enming Zhang [Fri, 5 Jul 2019 15:09:22 +0000 (08:09 -0700)]
rgw: fix the bug of rgw not doing necessary checking to website configuration
Fixes: http://tracker.ceph.com/issues/40678
Signed-off-by: Enming Zhang <enming.zhang@umcloud.com>
(cherry picked from commit
355f392ad26631f44dac250296e96f421d86fb8f )
Conflicts:
src/rgw/rgw_rest_s3.cc
- 3275dffa45ae08d3818562d2d11a8d1c0afa326b is not being backported
Yuri Weinstein [Thu, 17 Oct 2019 16:45:12 +0000 (09:45 -0700)]
Merge pull request #29224 from smithfarm/wip-39223-mimic
mimic: mds: behind on trimming and [dentry] was purgeable but no longer is!
Reviewed-by: Venky Shankar <vshankar@redhat.com>
Yuri Weinstein [Thu, 17 Oct 2019 16:44:39 +0000 (09:44 -0700)]
Merge pull request #29232 from smithfarm/wip-40439-mimic
mimic: mds: cannot switch mds state from standby-replay to active
Reviewed-by: Venky Shankar <vshankar@redhat.com>
Yuri Weinstein [Thu, 17 Oct 2019 16:44:11 +0000 (09:44 -0700)]
Merge pull request #29479 from xiaoxichen/wip-41001
mimic: cephfs: client: unlink dentry for inode with llref=0
Reviewed-by: Venky Shankar <vshankar@redhat.com>
Kefu Chai [Mon, 12 Aug 2019 02:12:28 +0000 (10:12 +0800)]
ceph-objectstore-tool: update-mon-db: do not fail if incmap is missing
there is a chance that we could use an OSD which does not have the incmap
of a certain epoch for rebuilding the monstore. an OSD does not read
and store the incmap if the MOSDMap message already has the fullmap of
that epoch, and if an OSD does not have the previous fullmap, the monitor
will just send it the fullmap. so it's not unusual for an OSD to have
a fullmap of some epoch without the corresponding incmap.
Fixes: https://tracker.ceph.com/issues/41177
Signed-off-by: Kefu Chai <kchai@redhat.com>
(cherry picked from commit
2e2414b3df97b22ccc54500830c24b28597ce75f )
Yuri Weinstein [Thu, 17 Oct 2019 16:04:52 +0000 (09:04 -0700)]
Merge pull request #29218 from smithfarm/wip-38709-mimic
mimic: tests: kclient unmount hangs after file system goes down
Reviewed-by: Venky Shankar <vshankar@redhat.com>
Yuri Weinstein [Thu, 17 Oct 2019 16:04:29 +0000 (09:04 -0700)]
Merge pull request #29220 from smithfarm/wip-39210-mimic
mimic: mds: mds_cap_revoke_eviction_timeout is not used to initialize Server::cap_revoke_eviction_timeout
Reviewed-by: Venky Shankar <vshankar@redhat.com>
Yuri Weinstein [Thu, 17 Oct 2019 16:03:59 +0000 (09:03 -0700)]
Merge pull request #29222 from smithfarm/wip-39212-mimic
mimic: cephfs: MDSTableServer.cc: 83: FAILED assert(version == tid)
Reviewed-by: Venky Shankar <vshankar@redhat.com>
Reviewed-by: Yan, Zheng <zyan@redhat.com>
Yuri Weinstein [Thu, 17 Oct 2019 15:58:00 +0000 (08:58 -0700)]
Merge pull request #29223 from smithfarm/wip-39215-mimic
mimic: mds: there is an assertion when calling Beacon::shutdown()
Reviewed-by: Venky Shankar <vshankar@redhat.com>
Yuri Weinstein [Thu, 17 Oct 2019 15:57:28 +0000 (08:57 -0700)]
Merge pull request #29228 from smithfarm/wip-40219-mimic
mimic: tests: cephfs: TestMisc.test_evict_client fails
Reviewed-by: Venky Shankar <vshankar@redhat.com>
Yuri Weinstein [Thu, 17 Oct 2019 15:56:58 +0000 (08:56 -0700)]
Merge pull request #29230 from smithfarm/wip-40437-mimic
mimic: cephfs: getattr on snap inode stuck
Reviewed-by: Venky Shankar <vshankar@redhat.com>
Yuri Weinstein [Thu, 17 Oct 2019 15:50:42 +0000 (08:50 -0700)]
Merge pull request #30796 from dillaman/wip-36122-mimic
mimic: librbd: properly handle potential object map failures
Reviewed-by: Jason Dillaman <dillaman@redhat.com>
Yuri Weinstein [Thu, 17 Oct 2019 15:49:58 +0000 (08:49 -0700)]
Merge pull request #30828 from dillaman/wip-41882-mimic
mimic: rbd-mirror: cannot restore deferred deletion mirrored images
Reviewed-by: Mykola Golub <mgolub@mirantis.com>
Abhishek Lekshmanan [Wed, 7 Aug 2019 15:09:32 +0000 (17:09 +0200)]
rgw: asio: check the remote endpoint before processing requests
`socket.remote_endpoint()` can throw exceptions corresponding to errors in the
`getpeername` syscall; make sure these are handled.
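The same failure mode can be shown with a Python analogue, since the actual fix lives in the C++ Boost.Asio frontend. Python's `socket.getpeername()` raises `OSError` when the underlying syscall fails, for example on a socket that is not connected. The helper name below is hypothetical.

```python
import socket


def remote_endpoint_or_none(sock):
    """Return the peer address, or None if getpeername fails."""
    try:
        return sock.getpeername()
    except OSError:
        # The peer may already be gone; the caller should drop the connection
        # instead of letting the error propagate.
        return None
```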
Fixes: CVE-2019-10222, https://tracker.ceph.com/issues/40018
Signed-off-by: Abhishek Lekshmanan <abhishek@suse.com>
(cherry picked from commit
caa653196856ecdf50519a9a33195d5c4e3372af )
Conflicts:
src/rgw/rgw_asio_frontend.cc
conflicts due to optional-yield-ctx changes in master
Yuri Weinstein [Wed, 16 Oct 2019 23:26:47 +0000 (16:26 -0700)]
Merge pull request #30213 from smithfarm/wip-41449-mimic
mimic: core: mon: C_AckMarkedDown has not handled the Callback Arguments
Reviewed-by: David Zafman <dzafman@redhat.com>
Yuri Weinstein [Wed, 16 Oct 2019 23:22:05 +0000 (16:22 -0700)]
Merge pull request #30150 from neha-ojha/wip-40769-mimic
mimic: bluestore: common/options: Set concurrent bluestore rocksdb compactions to 2
Reviewed-by: Igor Fedotov <ifedotov@suse.com>
Reviewed-by: Josh Durgin <jdurgin@redhat.com>
Rishabh Dave [Thu, 7 Mar 2019 05:00:09 +0000 (10:30 +0530)]
mds: check earlier if directories are already exported
Signed-off-by: Rishabh Dave <ridave@redhat.com>
(cherry picked from commit
46fb90734f371b5467d81cae25c06f8a487a3041 )
Conflicts:
src/mds/MDCache.cc
- g_conf-> instead of g_conf()->
Rishabh Dave [Wed, 30 Jan 2019 09:16:36 +0000 (14:46 +0530)]
mds: dont print auth trees if they are too many
Signed-off-by: Rishabh Dave <ridave@redhat.com>
(cherry picked from commit
13c4152c8b1756a871f817d4b503396b6c9cc81d )
Conflicts:
src/mds/MDBalancer.cc
- g_conf
- authsubs in method MDBalancer::handle_export_pins() is redeclared from set
to vector in master branch. Removed vector declaration as per the luminous
backport.
Rishabh Dave [Mon, 21 Jan 2019 10:06:57 +0000 (15:36 +0530)]
mds: dont print subtrees if they are too many/big
Also, add an argument to force print subtrees and let
MDBalancer::tick() always print subtrees.
Fixes: https://tracker.ceph.com/issues/37726
Signed-off-by: Rishabh Dave <ridave@redhat.com>
(cherry picked from commit
50d28ec4f842670fb96ab3b0d2b37a9c2e282236 )
Conflicts:
src/mds/MDCache.cc
- g_conf
Yuri Weinstein [Tue, 15 Oct 2019 20:06:01 +0000 (13:06 -0700)]
Merge pull request #30069 from smithfarm/wip-40124-mimic
mimic: qa/rgw: don't use ceph-ansible in s3a-hadoop suite
Reviewed-by: Casey Bodley <cbodley@redhat.com>