git.apps.os.sepia.ceph.com Git - ceph.git/log
ceph.git
5 years agoqa/suites/fs: add python3-cephfs to packages 31285/head
Kefu Chai [Fri, 3 Aug 2018 09:27:20 +0000 (17:27 +0800)]
qa/suites/fs: add python3-cephfs to packages

The default set of packages to install is in
$suite/qa/packages/packages.yaml. See get_package_list() in
teuthology/teuthology/task/install/__init__.py for how we prepare a
package list for the install task.

To run python3 tests in
fs/basic_functional/tasks/volume-client, we need to install
python3-cephfs. Note that
_package_override() in teuthology/teuthology/task/install/rpm.py will
take care of the different naming on centos/rhel, where the python3
packages are named python34-*.

Signed-off-by: Kefu Chai <kchai@redhat.com>
(cherry picked from commit 9754b3769bf07af1617ad48376769df016a58d9d)

Conflicts:
qa/cephfs/begin.yaml
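
The renaming described in this commit message can be illustrated with a minimal sketch. The function name `override_python3_names` and the distro strings are assumptions made for illustration; this is not the code of teuthology's `_package_override()`.

```python
# Hypothetical sketch: NOT teuthology's _package_override(), only an
# illustration of the python3-* -> python34-* renaming on centos/rhel.

def override_python3_names(packages, distro):
    """Return package names adjusted for the distro's python3 naming."""
    if distro not in ("centos", "rhel"):
        # Other distros keep the python3-* names unchanged.
        return list(packages)
    prefix = "python3-"
    return [
        "python34-" + name[len(prefix):] if name.startswith(prefix) else name
        for name in packages
    ]


print(override_python3_names(["ceph", "python3-cephfs"], "centos"))
print(override_python3_names(["ceph", "python3-cephfs"], "ubuntu"))
```

Keeping one package list and translating names per distro means the suite YAML only has to mention python3-cephfs once.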

5 years agoqa: do not install python3 packages in task.install
Kefu Chai [Fri, 3 Aug 2018 09:02:49 +0000 (17:02 +0800)]
qa: do not install python3 packages in task.install

This reverts commit c1efd59f618e24cf060d564ac0f21d5b0b57fd4a

task.install.rpm installs the packages listed in
$suites/qa/packages/packages.yaml, and the package list also applies to
the upgrade tests. But we don't have python3 bindings packages in jewel
-- they were introduced in kraken.

Signed-off-by: Kefu Chai <kchai@redhat.com>
(cherry picked from commit 7e5c85b604c8f9045152d37f54fae4246ea82737)

Conflicts:
qa/packages/packages.yaml

5 years agoMerge pull request #31211 from sebastian-philipp/mimic-ceph-volume-device_id
Jan Fajerski [Thu, 31 Oct 2019 10:48:09 +0000 (11:48 +0100)]
Merge pull request #31211 from sebastian-philipp/mimic-ceph-volume-device_id

mimic: ceph-volume: add Ceph's device id to inventory

5 years agoMerge pull request #31258 from jan--f/wip-41288-mimic
Nathan Cutler [Thu, 31 Oct 2019 10:01:43 +0000 (11:01 +0100)]
Merge pull request #31258 from jan--f/wip-41288-mimic

mimic: doc: update bluestore cache settings and clarify data fraction

Reviewed-by: Igor Fedotov <ifedotov@suse.com>
5 years agoMerge pull request #31254 from alfredodeza/wip-rm42292-mimic
Yuri Weinstein [Thu, 31 Oct 2019 00:22:01 +0000 (17:22 -0700)]
Merge pull request #31254 from alfredodeza/wip-rm42292-mimic

mimic: qa/ceph-disk: use a Python2.7 compatible version of pytest

Reviewed-by: Yuri Weinstein <yweinste@redhat.com>
Reviewed-by: Nathan Cutler <ncutler@suse.com>
5 years agoMerge pull request #31227 from jan--f/c-v-missing-mimic-lvm-backports
Alfredo Deza [Wed, 30 Oct 2019 19:12:35 +0000 (15:12 -0400)]
Merge pull request #31227 from jan--f/c-v-missing-mimic-lvm-backports

Add some missing backports to mimic

Reviewed-by: Alfredo Deza <adeza@redhat.com>
5 years agodoc: update bluestore cache settings and clarify data fraction 31258/head
Jan Fajerski [Mon, 29 Apr 2019 12:52:27 +0000 (14:52 +0200)]
doc: update bluestore cache settings and clarify data fraction

Fixes: http://tracker.ceph.com/issues/39522
Signed-off-by: Jan Fajerski <jfajerski@suse.com>
(cherry picked from commit 9d8336a7f418fe2bb11361dd74a214403b1e5be7)

5 years agoqa/ceph-disk: use a Python2.7 compatible version of pytest 31254/head
Alfredo Deza [Fri, 25 Oct 2019 15:49:54 +0000 (11:49 -0400)]
qa/ceph-disk: use a Python2.7 compatible version of pytest

Signed-off-by: Alfredo Deza <adeza@redhat.com>
(cherry picked from commit 149ce7af588b9c052e41d687e722fee9b7255d7c)

5 years agoMerge pull request #28452 from thmour/mimic_test
Yuri Weinstein [Tue, 29 Oct 2019 19:36:53 +0000 (12:36 -0700)]
Merge pull request #28452 from thmour/mimic_test

mimic: mds: stopping MDS with a large cache (40+GB) causes it to miss heartbeats

Reviewed-by: Venky Shankar <vshankar@redhat.com>
5 years agoMerge pull request #30950 from sidharthanup/mds-evict-duplicate-mimic
Yuri Weinstein [Tue, 29 Oct 2019 19:35:58 +0000 (12:35 -0700)]
Merge pull request #30950 from sidharthanup/mds-evict-duplicate-mimic

mimic: mds: Fix duplicate client entries in eviction list

Reviewed-by: Venky Shankar <vshankar@redhat.com>
5 years agoceph-volume: update volume's tags structure when setting tags 31227/head
Mohamad Gebai [Tue, 2 Apr 2019 10:45:02 +0000 (06:45 -0400)]
ceph-volume: update volume's tags structure when setting tags

Signed-off-by: Mohamad Gebai <mgebai@suse.com>
(cherry picked from commit 4a1198deffb0baf647a6a31e03cbfe98f011ff14)

5 years agoceph-volume: add clear_tag function for LVs
Mohamad Gebai [Sun, 31 Mar 2019 17:06:23 +0000 (13:06 -0400)]
ceph-volume: add clear_tag function for LVs

Signed-off-by: Mohamad Gebai <mgebai@suse.com>
(cherry picked from commit 776d485af8b6225fd4059952df36e40ef0ad12b4)

5 years agoceph-volume: add reduce_vg function
Mohamad Gebai [Sun, 31 Mar 2019 17:04:40 +0000 (13:04 -0400)]
ceph-volume: add reduce_vg function

Signed-off-by: Mohamad Gebai <mgebai@suse.com>
(cherry picked from commit ce0184b5d7f24f2b3b6a9491e0f3c1c847b8c0e7)

5 years agoceph-volume: look for hidden partitions when populating lvs
Mohamad Gebai [Sun, 31 Mar 2019 17:04:10 +0000 (13:04 -0400)]
ceph-volume: look for hidden partitions when populating lvs

Signed-off-by: Mohamad Gebai <mgebai@suse.com>
(cherry picked from commit 463091e46ba4032f1b8d90a6770fd7e2d3277a74)

5 years agoceph-volume: set a 1G extent size when creating vgs
Andrew Schoen [Thu, 29 Nov 2018 19:44:07 +0000 (13:44 -0600)]
ceph-volume: set a 1G extent size when creating vgs

This allows us to create larger lvs than the default of 4m
and is easier to reason about when sizing the lvs, as everything is
reported in GBs.

Signed-off-by: Andrew Schoen <aschoen@redhat.com>
(cherry picked from commit 4a1b97efc87f3df15a39a76de074b4791f3528ca)
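
The sizing arithmetic behind "easier to reason about" can be sketched as follows. The helper `extents_needed` and the 100 GiB example size are assumptions for illustration and are not part of ceph-volume; the 4 MiB figure is the default mentioned in the commit message.

```python
MIB = 1024 ** 2
GIB = 1024 ** 3


def extents_needed(lv_size_bytes, extent_size_bytes):
    """Number of whole extents needed to hold lv_size_bytes (rounded up)."""
    return -(-lv_size_bytes // extent_size_bytes)


lv_size = 100 * GIB  # hypothetical LV size
print(extents_needed(lv_size, 4 * MIB))  # extent count with 4 MiB extents
print(extents_needed(lv_size, 1 * GIB))  # extent count with 1 GiB extents
```

With a 1 GiB extent size, the extent count of an LV equals its size in GiB, so sizes requested in GB map directly onto extents.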

5 years agoMerge pull request #30225 from dzafman/wip-network-mimic
Yuri Weinstein [Tue, 29 Oct 2019 16:35:10 +0000 (09:35 -0700)]
Merge pull request #30225 from dzafman/wip-network-mimic

mimic: core: Health warnings on long network ping times

Reviewed-by: Neha Ojha <nojha@redhat.com>
5 years agoceph-volume: add Ceph's device id to inventory 31211/head
Sebastian Wagner [Fri, 18 Oct 2019 11:59:44 +0000 (13:59 +0200)]
ceph-volume: add Ceph's device id to inventory

This will benefit the orchestrator and dashboard to show a unified view of devices with SMART data

Signed-off-by: Sebastian Wagner <sebastian.wagner@suse.com>
(cherry picked from commit e70d6041c1a093ed5c2b77abe17e1ede533d9659)

5 years agoosd/OSD: auto mark heartbeat sessions as stale and tear them down 30225/head
xie xingguo [Wed, 26 Jun 2019 06:24:08 +0000 (14:24 +0800)]
osd/OSD: auto mark heartbeat sessions as stale and tear them down

The primary benefit is that the OSD doesn't need to keep a flood of
blocked heartbeat messages around in memory.
This prevents OSDs from accumulating heartbeat messages due to a
broken switch and then exhausting the whole node's memory:

Jun 11 04:19:26 host-192-168-9-12 kernel: [409881.137077] Out of memory:
Kill process 1471476 (ceph-osd) score 47 or sacrifice child
Jun 11 04:19:26 host-192-168-9-12 kernel: [409881.146054] Killed process
1471476 (ceph-osd) total-vm:4822548kB, anon-rss:3097860kB,
file-rss:2556kB, shmem-rss:0kB

Fixes: http://tracker.ceph.com/issues/40586
Signed-off-by: xie xingguo <xie.xingguo@zte.com.cn>
(cherry picked from commit 6cc90f363b8096d2d5fad30e57426d0cea9e3478)

Conflicts:
src/osd/OSD.cc (no boot_finisher.stop() and no lock_guard)
src/osd/OSD.h (trivial)

Fixed get_val() call in reset_heartbeat_peers()

5 years agomds: handle negative decay counter 28452/head
Patrick Donnelly [Sat, 2 Feb 2019 00:00:13 +0000 (16:00 -0800)]
mds: handle negative decay counter

Problem only exists in Luminous/Mimic.

Signed-off-by: Patrick Donnelly <pdonnell@redhat.com>
(cherry-picked from commit 5f23246)

5 years agotest/mds: fix Session cons call
Patrick Donnelly [Fri, 1 Feb 2019 18:07:58 +0000 (10:07 -0800)]
test/mds: fix Session cons call

Problem did not exist in master.

Signed-off-by: Patrick Donnelly <pdonnell@redhat.com>
(cherry-picked from commit 5ed5c51)

Conflicts:
src/test/mds/TestSessionFilter.cc

5 years agomds: simplify recall warnings
Patrick Donnelly [Mon, 28 Jan 2019 23:48:38 +0000 (15:48 -0800)]
mds: simplify recall warnings

Instead of a timeout and complicated decisions about whether the client is
releasing caps in an expeditious fashion, just use a DecayCounter that tracks
the number of caps we've recalled. This counter is decremented whenever the
client releases caps. If the counter passes a threshold, then we raise the
warning.

Similar reworking is done for the steady-state recall of client caps. Another
release DecayCounter is added so we can tell when the client is not releasing
any more caps.

Signed-off-by: Patrick Donnelly <pdonnell@redhat.com>
(cherry picked from commit c0b3a11)

Conflicts:
PendingReleaseNotes
src/mds/Beacon.cc
src/mds/Server.cc
src/mds/SessionMap.cc
src/mds/SessionMap.h
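
The recall-warning scheme described above might look roughly like this. This is a toy model under stated assumptions: the `DecayCounter` class below is a hypothetical exponential-decay counter written for illustration, not Ceph's C++ `DecayCounter`, and the half-life and threshold values are made up.

```python
import math


class DecayCounter:
    """Toy exponentially decaying counter (hypothetical, not Ceph's class)."""

    def __init__(self, half_life):
        self.half_life = half_life  # seconds for the value to halve
        self.value = 0.0
        self.last = 0.0             # time of the last decay step

    def _decay(self, now):
        elapsed = now - self.last
        if elapsed > 0:
            self.value *= math.pow(0.5, elapsed / self.half_life)
            self.last = now

    def hit(self, now, amount):
        """Decay to 'now', then add 'amount' (negative when caps are released)."""
        self._decay(now)
        self.value += amount
        return self.value

    def get(self, now):
        self._decay(now)
        return self.value


def should_warn(recalled, now, threshold):
    """Raise the warning only while the decayed recall count exceeds the threshold."""
    return recalled.get(now) > threshold


counter = DecayCounter(half_life=60.0)
counter.hit(0.0, 1000)    # MDS recalls 1000 caps
counter.hit(10.0, -400)   # client releases 400 of them
print(should_warn(counter, 10.0, threshold=300))
print(should_warn(counter, 610.0, threshold=300))
```

Because releases subtract from the counter, a burst of releases can push the value below zero, which is the situation the "mds: handle negative decay counter" commit above refers to.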

5 years agomds: add extra details for cache drop output
Patrick Donnelly [Fri, 25 Jan 2019 23:59:13 +0000 (15:59 -0800)]
mds: add extra details for cache drop output

Signed-off-by: Patrick Donnelly <pdonnell@redhat.com>
(cherry picked from commit 3bc093f)

Conflicts:
src/mds/Server.cc

5 years agoqa: test mds_max_caps_per_client conf
Patrick Donnelly [Fri, 25 Jan 2019 20:13:50 +0000 (12:13 -0800)]
qa: test mds_max_caps_per_client conf

Test that the MDS will not let a client sit above mds_max_caps_per_client caps.

Signed-off-by: Patrick Donnelly <pdonnell@redhat.com>
(cherry picked from commit 30aaa88)

5 years agomds: limit maximum number of caps held by session
Patrick Donnelly [Thu, 24 Jan 2019 22:23:08 +0000 (14:23 -0800)]
mds: limit maximum number of caps held by session

This is to prevent unsustainable situations where a client has so many
outstanding caps that a linear traversal/operation on the session's caps takes
unacceptable amounts of time.

Fixes: http://tracker.ceph.com/issues/38022
Signed-off-by: Patrick Donnelly <pdonnell@redhat.com>
(cherry picked from commit 48ca097)

Conflicts:
PendingReleaseNotes
src/mds/Server.cc

5 years agomds: adapt drop cache for incremental recall
Patrick Donnelly [Thu, 24 Jan 2019 22:22:42 +0000 (14:22 -0800)]
mds: adapt drop cache for incremental recall

Signed-off-by: Patrick Donnelly <pdonnell@redhat.com>
(cherry picked from commit 7244cae)

5 years agomds: recall caps incrementally
Patrick Donnelly [Wed, 23 Jan 2019 14:41:55 +0000 (06:41 -0800)]
mds: recall caps incrementally

As with trimming, use DecayCounters to throttle the number of caps we recall,
both globally and per-session.

Signed-off-by: Patrick Donnelly <pdonnell@redhat.com>
(cherry picked from commit ef46216)

Conflicts:
PendingReleaseNotes
qa/suites/fs/bugs/client_trim_caps/tasks/trim-i22073.yaml
src/mds/Beacon.cc
src/mds/MDSDaemon.cc
src/mds/Server.cc
src/mds/Server.h
src/mds/SessionMap.cc
src/mds/SessionMap.h
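
Throttling against both a global and a per-session budget can be sketched as below. The function `caps_to_recall` and all of its parameters are hypothetical; the "recent" arguments stand in for the current values of the decay counters, and the limits are illustrative rather than actual MDS option values.

```python
def caps_to_recall(wanted, session_recent, global_recent,
                   session_limit, global_limit):
    """How many caps may be recalled now without exceeding either throttle.

    session_recent / global_recent: decayed count of recently recalled caps
    for this session and for the whole MDS (assumed inputs).
    """
    session_room = max(0, session_limit - session_recent)
    global_room = max(0, global_limit - global_recent)
    return min(wanted, session_room, global_room)


# The global budget is nearly exhausted, so it is the binding constraint here.
print(caps_to_recall(wanted=5000, session_recent=1000, global_recent=9000,
                     session_limit=3000, global_limit=10000))
```

As the counters decay over time, the available room grows again, which spreads a large recall over several rounds.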

5 years agomds: cleanup Session init
Patrick Donnelly [Mon, 21 Jan 2019 18:57:45 +0000 (10:57 -0800)]
mds: cleanup Session init

Signed-off-by: Patrick Donnelly <pdonnell@redhat.com>
(cherry picked from commit ce153b8)

Conflicts:
src/mds/SessionMap.cc
src/mds/SessionMap.h

5 years agomds: adapt drop cache for incremental trim
Patrick Donnelly [Sun, 20 Jan 2019 04:40:11 +0000 (20:40 -0800)]
mds: adapt drop cache for incremental trim

Signed-off-by: Patrick Donnelly <pdonnell@redhat.com>
(cherry picked from commit b750b3b)

5 years agomds: add throttle for trimming MDCache
Patrick Donnelly [Sat, 19 Jan 2019 00:18:59 +0000 (16:18 -0800)]
mds: add throttle for trimming MDCache

This is necessary when the MDS cache size decreases by a significant amount.
For example, when stopping a large MDS or when the operator makes a large cache
size reduction.

Fixes: http://tracker.ceph.com/issues/37723
Signed-off-by: Patrick Donnelly <pdonnell@redhat.com>
(cherry picked from commit 7bf2f31)

Conflicts:
PendingReleaseNotes
src/mds/MDCache.cc
src/mds/MDCache.h

5 years agomds: cleanup SessionMap init
Patrick Donnelly [Fri, 18 Jan 2019 23:43:48 +0000 (15:43 -0800)]
mds: cleanup SessionMap init

Signed-off-by: Patrick Donnelly <pdonnell@redhat.com>
(cherry picked from commit 69efdaf)

Conflicts:
    src/mds/SessionMap.h

5 years agoMerge pull request #28585 from ukernel/mimic-40327
Yuri Weinstein [Wed, 23 Oct 2019 15:32:06 +0000 (08:32 -0700)]
Merge pull request #28585 from ukernel/mimic-40327

mimic: mds: change how mds revoke stale caps

Reviewed-by: Venky Shankar <vshankar@redhat.com>
5 years agoMerge pull request #30841 from smithfarm/wip-42263-mimic
Yuri Weinstein [Wed, 23 Oct 2019 15:31:21 +0000 (08:31 -0700)]
Merge pull request #30841 from smithfarm/wip-42263-mimic

mimic: tests: do not take ceph.conf.template from ceph/teuthology.git

Reviewed-by: Neha Ojha <nojha@redhat.com>
Reviewed-by: Kefu Chai <kchai@redhat.com>
5 years agoMerge pull request #30918 from smithfarm/wip-42122-mimic
Yuri Weinstein [Wed, 23 Oct 2019 15:28:47 +0000 (08:28 -0700)]
Merge pull request #30918 from smithfarm/wip-42122-mimic

mimic: cephfs: client: add procession of SEEK_HOLE and SEEK_DATA in lseek.

Reviewed-by: Venky Shankar <vshankar@redhat.com>
5 years agoMerge pull request #30979 from smithfarm/wip-41464-mimic
Yuri Weinstein [Wed, 23 Oct 2019 15:28:14 +0000 (08:28 -0700)]
Merge pull request #30979 from smithfarm/wip-41464-mimic

mimic: tools: ceph-objectstore-tool: update-mon-db: do not fail if incmap is missing

Reviewed-by: Kefu Chai <kchai@redhat.com>
Reviewed-by: Neha Ojha <nojha@redhat.com>
5 years agoMerge pull request #31017 from smithfarm/wip-40896-mimic-revert
Yuri Weinstein [Wed, 23 Oct 2019 15:27:29 +0000 (08:27 -0700)]
Merge pull request #31017 from smithfarm/wip-40896-mimic-revert

mimic: cephfs: Revert "ceph_volume_client: convert string to bytes object"

Reviewed-by: Venky Shankar <vshankar@redhat.com>
5 years agoMerge pull request #29219 from smithfarm/wip-38875-mimic
Yuri Weinstein [Wed, 23 Oct 2019 15:25:53 +0000 (08:25 -0700)]
Merge pull request #29219 from smithfarm/wip-38875-mimic

mimic: mds: high debug logging with many subtrees is slow

Reviewed-by: Venky Shankar <vshankar@redhat.com>
5 years agoMerge pull request #30932 from smithfarm/wip-42034-mimic
Yuri Weinstein [Wed, 23 Oct 2019 15:25:24 +0000 (08:25 -0700)]
Merge pull request #30932 from smithfarm/wip-42034-mimic

mimic: cephfs: client: EINVAL may be returned when offset is 0.

Reviewed-by: Venky Shankar <vshankar@redhat.com>
5 years agoMerge pull request #30933 from smithfarm/wip-42038-mimic
Yuri Weinstein [Wed, 23 Oct 2019 15:24:59 +0000 (08:24 -0700)]
Merge pull request #30933 from smithfarm/wip-42038-mimic

mimic: cephfs: client: _readdir_cache_cb() may use the readdir_cache already clear

Reviewed-by: Venky Shankar <vshankar@redhat.com>
5 years agoMerge pull request #31090 from smithfarm/wip-42416-mimic
Nathan Cutler [Wed, 23 Oct 2019 15:10:09 +0000 (17:10 +0200)]
Merge pull request #31090 from smithfarm/wip-42416-mimic

mimic: doc/rbd: s/guess/xml/ for codeblock lexer

Reviewed-by: Jason Dillaman <dillaman@redhat.com>
5 years agodoc/rbd: s/guess/xml/ for codeblock lexer 31090/head
Kefu Chai [Wed, 16 Oct 2019 04:34:19 +0000 (12:34 +0800)]
doc/rbd: s/guess/xml/ for codeblock lexer

this change silences the warning of

```
doc/rbd/qemu-rbd.rst:174: WARNING: Pygments lexer name 'guess' is not
known
```

See http://pygments.org/docs/lexers/; we should use "xml" for XML.

Signed-off-by: Kefu Chai <kchai@redhat.com>
(cherry picked from commit df226da996e468d2707b08eb012d54b4e37ffdc6)

5 years agoMerge pull request #30775 from smithfarm/wip-41979-mimic
Yuri Weinstein [Tue, 22 Oct 2019 18:41:52 +0000 (11:41 -0700)]
Merge pull request #30775 from smithfarm/wip-41979-mimic

mimic: rgw: fix list versions starts with version_id=null

Reviewed-by: Casey Bodley <cbodley@redhat.com>
5 years agoMerge pull request #30868 from smithfarm/wip-41324-mimic
Yuri Weinstein [Tue, 22 Oct 2019 18:41:26 +0000 (11:41 -0700)]
Merge pull request #30868 from smithfarm/wip-41324-mimic

mimic: rgw: datalog/mdlog trim commands loop until done

Reviewed-by: Casey Bodley <cbodley@redhat.com>
5 years agoMerge pull request #30980 from smithfarm/wip-41496-mimic
Yuri Weinstein [Tue, 22 Oct 2019 18:40:59 +0000 (11:40 -0700)]
Merge pull request #30980 from smithfarm/wip-41496-mimic

mimic: rgw: fix the bug of rgw not doing necessary checking to website configuration

Reviewed-by: Casey Bodley <cbodley@redhat.com>
5 years agoMerge pull request #30891 from smithfarm/wip-41715-mimic
Yuri Weinstein [Tue, 22 Oct 2019 15:05:57 +0000 (08:05 -0700)]
Merge pull request #30891 from smithfarm/wip-41715-mimic

mimic: rgw: fix refcount tags to match and update object's idtag

Reviewed-by: J. Eric Ivancich <ivancich@redhat.com>
5 years agoMerge pull request #30977 from theanalyst/wip-41570-mimic
Yuri Weinstein [Tue, 22 Oct 2019 15:05:09 +0000 (08:05 -0700)]
Merge pull request #30977 from theanalyst/wip-41570-mimic

mimic: rgw: asio: check the remote endpoint before processing requests

Reviewed-by: Casey Bodley <cbodley@redhat.com>
Reviewed-by: Yuri Weinstein <yweinste@redhat.com>
5 years agoqa/tasks/ceph.conf: do not warn on TOO_FEW_OSDS 30841/head
Sage Weil [Fri, 10 May 2019 19:45:22 +0000 (14:45 -0500)]
qa/tasks/ceph.conf: do not warn on TOO_FEW_OSDS

Signed-off-by: Sage Weil <sage@redhat.com>
(cherry picked from commit 0483c1c3e7ffdfa6a6f65c5ef000c45d2f096428)

5 years agoMerge pull request #30713 from smithfarm/wip-40258-mimic
Yuri Weinstein [Mon, 21 Oct 2019 23:53:42 +0000 (16:53 -0700)]
Merge pull request #30713 from smithfarm/wip-40258-mimic

mimic: cmake: detect armv8 crc and crypto feature using CHECK_C_COMPILER_FLAG

Reviewed-by: Kefu Chai <kchai@redhat.com>
Reviewed-by: Josh Durgin <jdurgin@redhat.com>
5 years agoMerge pull request #30893 from smithfarm/wip-41964-mimic
Yuri Weinstein [Mon, 21 Oct 2019 23:52:41 +0000 (16:52 -0700)]
Merge pull request #30893 from smithfarm/wip-41964-mimic

mimic: tools/rados: list objects in a pg

Reviewed-by: Vikhyat Umrao <vikhyat@redhat.com>
Reviewed-by: Neha Ojha <nojha@redhat.com>
5 years agoMerge pull request #30898 from smithfarm/wip-42128-mimic
Yuri Weinstein [Mon, 21 Oct 2019 23:51:55 +0000 (16:51 -0700)]
Merge pull request #30898 from smithfarm/wip-42128-mimic

mimic: osd/OSDMap: do not trust partially simplified pg_upmap_item

Reviewed-by: xie xingguo <xie.xingguo@zte.com.cn>
5 years agoMerge pull request #30903 from smithfarm/wip-42154-mimic
Yuri Weinstein [Mon, 21 Oct 2019 23:51:30 +0000 (16:51 -0700)]
Merge pull request #30903 from smithfarm/wip-42154-mimic

mimic: mon/OSDMonitor: trim not-longer-exist failure reporters

Reviewed-by: Neha Ojha <nojha@redhat.com>
Reviewed-by: Kefu Chai <kchai@redhat.com>
5 years agoMerge pull request #30924 from vumrao/wip-vumrao-42240
Yuri Weinstein [Mon, 21 Oct 2019 23:50:40 +0000 (16:50 -0700)]
Merge pull request #30924 from vumrao/wip-vumrao-42240

mimic: osd/PG: Add PG to large omap log message

Reviewed-by: Brad Hubbard <bhubbard@redhat.com>
Reviewed-by: Neha Ojha <nojha@redhat.com>
5 years agoMerge pull request #30846 from wido/mimic-42116
Yuri Weinstein [Mon, 21 Oct 2019 23:48:02 +0000 (16:48 -0700)]
Merge pull request #30846 from wido/mimic-42116

mimic: mgr/telemetry: Ignore crashes in report when module not enabled

Reviewed-by: Sage Weil <sage@redhat.com>
5 years agoMerge pull request #30895 from smithfarm/wip-42036-mimic
Yuri Weinstein [Mon, 21 Oct 2019 23:47:22 +0000 (16:47 -0700)]
Merge pull request #30895 from smithfarm/wip-42036-mimic

mimic: osd/PeeringState: recover_got - add special handler for empty log

Reviewed-by: xie xingguo <xie.xingguo@zte.com.cn>
Reviewed-by: Neha Ojha <nojha@redhat.com>
5 years agoMerge pull request #30901 from smithfarm/wip-42137-mimic
Yuri Weinstein [Mon, 21 Oct 2019 23:46:52 +0000 (16:46 -0700)]
Merge pull request #30901 from smithfarm/wip-42137-mimic

mimic: osd: Remove unused osdmap flags full, nearfull from output

Reviewed-by: David Zafman <dzafman@redhat.com>
Reviewed-by: Kefu Chai <kchai@redhat.com>
Reviewed-by: xie xingguo <xie.xingguo@zte.com.cn>
5 years agoMerge pull request #30916 from smithfarm/wip-41457-mimic
Yuri Weinstein [Mon, 21 Oct 2019 23:45:59 +0000 (16:45 -0700)]
Merge pull request #30916 from smithfarm/wip-41457-mimic

mimic: osd: merge replica log on primary need according to replica log's crt

Reviewed-by: Neha Ojha <nojha@redhat.com>
5 years agoMerge pull request #30982 from tchaikov/wip-mimic-42362
Yuri Weinstein [Mon, 21 Oct 2019 23:45:30 +0000 (16:45 -0700)]
Merge pull request #30982 from tchaikov/wip-mimic-42362

mimic: build/ops: python3-cephfs should provide python36-cephfs

Reviewed-by: Nathan Cutler <ncutler@suse.com>
5 years agoMerge pull request #30991 from smithfarm/wip-37520-mimic-revert
Yuri Weinstein [Mon, 21 Oct 2019 23:44:32 +0000 (16:44 -0700)]
Merge pull request #30991 from smithfarm/wip-37520-mimic-revert

mimic: msg: Revert "msg/async: do not trigger RESETSESSION from connect fault during connection phase"

Reviewed-by: Ricardo Dias <rdias@suse.com>
Reviewed-by: xie xingguo <xie.xingguo@zte.com.cn>
5 years agoRevert "ceph_volume_client: make UTF-8 encoding explicit" 31017/head
Nathan Cutler [Mon, 21 Oct 2019 12:02:08 +0000 (14:02 +0200)]
Revert "ceph_volume_client: make UTF-8 encoding explicit"

This reverts commit ddb8cfa072cc19fb6cb61128b4fe4b8ffe1a1742.

Signed-off-by: Nathan Cutler <ncutler@suse.com>
5 years agoRevert "pybind/ceph_volume_client: allow volume_client to"
Nathan Cutler [Mon, 21 Oct 2019 12:02:03 +0000 (14:02 +0200)]
Revert "pybind/ceph_volume_client: allow volume_client to"

This reverts commit 1ab18fcf056ef8e4fcb9ed27e64ee4f8a75866dc.

Signed-off-by: Nathan Cutler <ncutler@suse.com>
5 years agoRevert "ceph_volume_client: convert string to bytes object"
Nathan Cutler [Mon, 21 Oct 2019 12:01:55 +0000 (14:01 +0200)]
Revert "ceph_volume_client: convert string to bytes object"

This reverts commit e081e97190b58d3a09377275f6d484bf9020fe51.

Signed-off-by: Nathan Cutler <ncutler@suse.com>
5 years agoRevert "ceph_volume_client: don't convert None to str object"
Nathan Cutler [Mon, 21 Oct 2019 12:01:48 +0000 (14:01 +0200)]
Revert "ceph_volume_client: don't convert None to str object"

This reverts commit e085982def42738d181240c7b6e97b30f6bc0cf8.

Signed-off-by: Nathan Cutler <ncutler@suse.com>
5 years agomds: check last laggy before marking unresponsive client stale 28585/head
Yan, Zheng [Wed, 19 Jun 2019 06:39:55 +0000 (14:39 +0800)]
mds: check last laggy before marking unresponsive client stale

The current MDS may evict an unresponsive client without first marking
its session stale, so we need to adjust the last laggy check.

Signed-off-by: "Yan, Zheng" <zyan@redhat.com>
(cherry picked from commit e5cc6f52feb64fa1c7ba1ee2b304cd6588f16e7a)

 Conflicts:
src/mds/Server.cc

5 years agomds: remove the code that skip evicting the only client
Yan, Zheng [Wed, 19 Jun 2019 03:42:05 +0000 (11:42 +0800)]
mds: remove the code that skip evicting the only client

There is already logic that defers marking an unresponsive client stale,
so there is no reason to defer evicting the only stale client.

Signed-off-by: "Yan, Zheng" <zyan@redhat.com>
(cherry picked from commit cd29206974427a4f6ab410b1482bbd8ebfb55fbd)

 Conflicts:
qa/tasks/cephfs/test_misc.py

5 years agoqa/cephfs: update tests for stale session handling
Yan, Zheng [Tue, 5 Mar 2019 09:40:08 +0000 (17:40 +0800)]
qa/cephfs: update tests for stale session handling

Signed-off-by: "Yan, Zheng" <zyan@redhat.com>
(cherry picked from commit 1c8be588e32f47ca712561711ad1ffdddc54b330)

 Conflicts:
qa/tasks/cephfs/test_client_recovery.py

5 years agomds: change how mds revoke stale caps
Yan, Zheng [Wed, 27 Feb 2019 12:51:38 +0000 (20:51 +0800)]
mds: change how mds revoke stale caps

- Only revokes conflicting caps from stale client.
- If stale client holds conflicting CEPH_CAP_ANY_WR,
  blacklist and kill it.

Fixes: https://tracker.ceph.com/issues/38326
Signed-off-by: "Yan, Zheng" <zyan@redhat.com>
(cherry picked from commit dcae1ea2d30398f7b6493a74b482e964a21fcfeb)

 Conflicts:
src/mds/CInode.cc
src/mds/Capability.cc
src/mds/Locker.cc
src/mds/MDSRank.h
src/mds/Server.cc
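
The two rules in this commit message can be sketched with a small bitmask example. The bit values below are hypothetical stand-ins (the real CEPH_CAP_* constants are defined in the Ceph headers), and `handle_stale_client` is an illustrative helper, not a function in the MDS.

```python
# Hypothetical bit values for illustration only.
CAP_RD = 0x1
CAP_WR = 0x2
CAP_ANY_WR = CAP_WR


def handle_stale_client(issued, conflicting):
    """Return (action, caps_left) for a stale client.

    issued:      caps the stale client currently holds
    conflicting: caps that clash with what another client is requesting
    """
    clash = issued & conflicting
    if not clash:
        return ("keep", issued)            # nothing conflicts, leave the caps alone
    if clash & CAP_ANY_WR:
        return ("blacklist", 0)            # conflicting write caps: blacklist and kill
    return ("revoke", issued & ~clash)     # revoke only the conflicting caps


print(handle_stale_client(CAP_RD | CAP_WR, CAP_RD))
print(handle_stale_client(CAP_RD | CAP_WR, CAP_WR))
```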

5 years agomds: don't mark unresponsive sessions holding no caps stale
Yan, Zheng [Mon, 17 Jun 2019 04:58:58 +0000 (12:58 +0800)]
mds: don't mark unresponsive sessions holding no caps stale

When an unresponsive MDS session holds no caps, do not mark it stale
even after session_timeout; at session_autoclose, evict it directly.

Fixes: http://tracker.ceph.com/issues/17854
Signed-off-by: Rishabh Dave <ridave@redhat.com>
(cherry picked from commit 98af31d10f362c05ea8ed57495973b08599431e7)

 Conflicts:
src/mds/Server.cc
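
The decision described here might be sketched as follows. `next_session_action` is a hypothetical helper; `session_timeout` and `session_autoclose` are the settings named in the message, while the numeric values and the handling of sessions that do hold caps at autoclose are assumptions.

```python
def next_session_action(idle_seconds, num_caps, session_timeout, session_autoclose):
    """Decide what to do with an unresponsive session (illustrative only)."""
    if idle_seconds >= session_autoclose:
        # Assumption: every session is evicted once session_autoclose has passed.
        return "evict"
    if idle_seconds >= session_timeout and num_caps > 0:
        return "mark_stale"
    # Sessions holding no caps are left alone until session_autoclose.
    return "none"


print(next_session_action(90, num_caps=0, session_timeout=60, session_autoclose=300))
print(next_session_action(90, num_caps=5, session_timeout=60, session_autoclose=300))
print(next_session_action(400, num_caps=0, session_timeout=60, session_autoclose=300))
```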

5 years agomds: optimize resuming stale caps
Yan, Zheng [Mon, 10 Dec 2018 03:37:32 +0000 (11:37 +0800)]
mds: optimize resuming stale caps

If the client doesn't want any caps, there is no need to re-issue stale
caps.

A special case is when the client wants some caps but skipped updating
'wanted'. In this case, the client needs to update 'wanted' when the
stale session gets renewed.

Fixes: http://tracker.ceph.com/issues/38043
Signed-off-by: "Yan, Zheng" <zyan@redhat.com>
(cherry picked from commit e824b3d2024db36789fdf579f0af9bf3bbe55d51)

 Conflicts:
src/client/Client.cc
src/mds/cephfs_features.h

5 years agoclient: avoid unnecessary wakeup when handling RENEWCAPS
Yan, Zheng [Thu, 6 Dec 2018 09:22:25 +0000 (17:22 +0800)]
client: avoid unnecessary wakeup when handling RENEWCAPS

Signed-off-by: "Yan, Zheng" <zyan@redhat.com>
(cherry picked from commit c744bc1673ca0b4e50f41516fdf49c8560db073a)

5 years agoclient: don't wakeup cap waiters twice when mds recovered
Yan, Zheng [Thu, 6 Dec 2018 09:16:49 +0000 (17:16 +0800)]
client: don't wakeup cap waiters twice when mds recovered

Both kick_maxsize_requests() and wake_inode_waiters() wake up cap
waiters.

Signed-off-by: "Yan, Zheng" <zyan@redhat.com>
(cherry picked from commit 5993a93e606d23ef311831863444294d7c64cc04)

 Conflicts:
src/client/Client.cc
src/client/Client.h

5 years agoclient: set cap->wanted when adding new cap
Yan, Zheng [Thu, 6 Dec 2018 07:30:44 +0000 (15:30 +0800)]
client: set cap->wanted when adding new cap

This avoids an unnecessary cap message if the cap is added by an
open/create request reply.

Signed-off-by: "Yan, Zheng" <zyan@redhat.com>
(cherry picked from commit 314660a46a8a1af97f70b2bac05b2f6fa5d23bc4)

 Conflicts:
src/client/Client.h

5 years agomds: optimize revoking stale caps
Yan, Zheng [Fri, 23 Nov 2018 08:19:52 +0000 (16:19 +0800)]
mds: optimize revoking stale caps

For caps that are not being revoked, don't have a writeable range,
and don't want exclusive caps or file read/write, there is no need
to call Locker::revoke_stale_caps(Capability*), because these caps
don't need recovery and don't affect eval_gather()/try_eval().

Signed-off-by: "Yan, Zheng" <zyan@redhat.com>
(cherry picked from commit cb62030f0468fc04735c1b4cff73da779cb11ad8)

 Conflicts:
src/mds/CInode.cc
src/mds/Capability.h
src/mds/Locker.cc
src/mds/Migrator.cc

5 years agomds: put notable caps at the front of session's caps list
Yan, Zheng [Thu, 22 Nov 2018 09:28:15 +0000 (17:28 +0800)]
mds: put notable caps at the front of session's caps list

Notable Capabilities are ones that are being revoked, ones that
have writeable ranges and ones that want exclusive caps or want
file read/write.

Signed-off-by: "Yan, Zheng" <zyan@redhat.com>
(cherry picked from commit cb6e7184458e64720ad83c266e5e393a80c32697)
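
Why front-loading notable caps helps can be shown with a minimal sketch. The dictionary fields and helper names are hypothetical and only model the three criteria listed in the message; the MDS itself uses its own Capability and list types.

```python
from collections import deque


def is_notable(cap):
    """A cap is notable if any of the three criteria from the message holds."""
    return cap["revoking"] or cap["writeable_range"] or cap["wants_excl_or_file_rw"]


def add_cap(caps, cap):
    """Keep notable caps at the front of the session's list."""
    if is_notable(cap):
        caps.appendleft(cap)
    else:
        caps.append(cap)


def notable_caps(caps):
    """Collect notable caps, stopping at the first non-notable one."""
    result = []
    for cap in caps:
        if not is_notable(cap):
            break
        result.append(cap)
    return result


def make_cap(name, revoking=False, writeable_range=False, wants=False):
    return {"name": name, "revoking": revoking,
            "writeable_range": writeable_range, "wants_excl_or_file_rw": wants}


session_caps = deque()
add_cap(session_caps, make_cap("a"))
add_cap(session_caps, make_cap("b", revoking=True))
add_cap(session_caps, make_cap("c"))
add_cap(session_caps, make_cap("d", writeable_range=True))
print([cap["name"] for cap in notable_caps(session_caps)])
```

A traversal that only cares about notable caps can stop at the first ordinary one instead of walking the whole list. A cap whose state changes later would have to be moved within the list, which this sketch does not model.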

5 years agomds: track if client has writeable range in Capability
Yan, Zheng [Wed, 21 Nov 2018 12:22:25 +0000 (20:22 +0800)]
mds: track if client has writeable range in Capability

Signed-off-by: "Yan, Zheng" <zyan@redhat.com>
(cherry picked from commit 370ae1cb3e5dc07867d80e998082bc514e8fccfd)

 Conflicts:
src/mds/Locker.cc
src/mds/MDCache.h
src/mds/Server.cc

5 years agomds: add session pointer to Capability
Yan, Zheng [Mon, 26 Nov 2018 01:44:56 +0000 (09:44 +0800)]
mds: add session pointer to Capability

Signed-off-by: "Yan, Zheng" <zyan@redhat.com>
(cherry picked from commit 7c628472a86c6acebb20f0a2504744b10f250587)

 Conflicts:
src/mds/CInode.cc
src/mds/Capability.h
src/mds/Locker.cc

5 years agoclient: sync 'retain caps' logical from kernel client
Yan, Zheng [Thu, 22 Nov 2018 07:55:12 +0000 (15:55 +0800)]
client: sync 'retain caps' logical from kernel client

The main change is keeping CEPH_CAP_ANY_RD for unused file inodes

Signed-off-by: "Yan, Zheng" <zyan@redhat.com>
(cherry picked from commit 29034396398b1706db625e1abf2e6682be7130a5)

5 years agoclient: skip updating 'wanted' caps if caps are already issued
Yan, Zheng [Thu, 22 Nov 2018 07:02:36 +0000 (15:02 +0800)]
client: skip updating 'wanted' caps if caps are already issued

When reading a cached inode that already has Fscr caps, this can avoid
two cap messages (one updates 'wanted' caps, one clears 'wanted' caps).

Signed-off-by: "Yan, Zheng" <zyan@redhat.com>
(cherry picked from commit d20b260ecf3f323c87ad1e6865f87e2381444546)

5 years agoosd mon mgr: Changes for rebase and correction for this branch
David Zafman [Fri, 26 Jul 2019 05:23:21 +0000 (22:23 -0700)]
osd mon mgr: Changes for rebase and correction for this branch

- Fix use of asok_command() which doesn't do try/catch
- Need unregister_command() since unregister_commands() doesn't exist here
- Use Mutex::locker since lock_guard() isn't available
- Use new g_conf which isn't g_conf() anymore
- cct->_conf is a pointer now
- Use ceph_abort() because cct isn't available for ceph_abort_msg()

Signed-off-by: David Zafman <dzafman@redhat.com>
5 years agotest: Ignore OSD_SLOW_PING_TIME* if injecting socket failures
David Zafman [Thu, 3 Oct 2019 16:09:10 +0000 (09:09 -0700)]
test: Ignore OSD_SLOW_PING_TIME* if injecting socket failures

Fixes: https://tracker.ceph.com/issues/41743
Signed-off-by: David Zafman <dzafman@redhat.com>
(cherry picked from commit ded58ef91d6c8a68de49fa2c6b6e01636515c59b)

Conflicts: 3 yamls don't exist in Mimic

5 years agotest: Allow fractional milliseconds to make test possible
David Zafman [Fri, 6 Sep 2019 18:20:10 +0000 (11:20 -0700)]
test: Allow fractional milliseconds to make test possible

Fixes: https://tracker.ceph.com/issues/41689
Signed-off-by: David Zafman <dzafman@redhat.com>
(cherry picked from commit 6d2e4cb109caff8dae5e5e18563b6305131b488b)

5 years agodoc: Document network performance monitoring
David Zafman [Wed, 4 Sep 2019 18:38:09 +0000 (18:38 +0000)]
doc: Document network performance monitoring

Signed-off-by: David Zafman <dzafman@redhat.com>
(cherry picked from commit 71015b94abdf669695754a598a05a4a1c5d46f83)

Conflicts:
doc/rados/operations/monitoring.rst (trivial)

5 years agoosd doc mon mgr: To milliseconds for config value, user input and threshold out
David Zafman [Wed, 4 Sep 2019 17:13:32 +0000 (17:13 +0000)]
osd doc mon mgr: To milliseconds for config value, user input and threshold out

Signed-off-by: David Zafman <dzafman@redhat.com>
(cherry picked from commit 5f83a6158b29944cf8f5a069c50edba3e172cdcc)

Conflicts:
src/common/options.cc (trivial)

5 years agoosd mon mgr: Convert all network ping time output to milliseconds
David Zafman [Tue, 6 Aug 2019 03:57:48 +0000 (20:57 -0700)]
osd mon mgr: Convert all network ping time output to milliseconds

To output milliseconds (usec / 1000), treat as fixed point integers

Signed-off-by: David Zafman <dzafman@redhat.com>
(cherry picked from commit 9d02e5d39d7b5e2806a5d98bdde24f4584e70528)

Conflicts:
src/mon/PGMap.cc (trivial)
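
Treating the value as a fixed-point integer means the millisecond string can be produced without floating point. A minimal sketch, assuming non-negative integer microseconds and three decimal places (`usec_to_ms_str` is a hypothetical helper, not the routine added to Ceph):

```python
def usec_to_ms_str(usec):
    """Format non-negative integer microseconds as milliseconds, 3 decimals."""
    whole, frac = divmod(usec, 1000)
    return "%d.%03d" % (whole, frac)


for sample in (5, 1000, 1234567):
    print(sample, "usec =", usec_to_ms_str(sample), "ms")
```

Integer division and remainder keep the output exact, so a value such as 5 usec is rendered with its leading zeros instead of being rounded away.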

5 years agocommon: Add support routines to generate strings for fixed point
David Zafman [Fri, 9 Aug 2019 01:06:43 +0000 (18:06 -0700)]
common: Add support routines to generate strings for fixed point

Signed-off-by: David Zafman <dzafman@redhat.com>
(cherry picked from commit 8ac1562b4988fc3d52f92f15eb58075de0bcf27e)

Conflicts:
src/common/Formatter.h (trivial)

5 years agotest: Add basic test for network ping tracking
David Zafman [Sat, 13 Jul 2019 02:35:04 +0000 (19:35 -0700)]
test: Add basic test for network ping tracking

Signed-off-by: David Zafman <dzafman@redhat.com>
(cherry picked from commit 4fb42ea27e7b6acefd081b7b287d38347a6085ce)

5 years agoosd: Add debug_heartbeat_testing_span to allow quicker testing
David Zafman [Wed, 24 Jul 2019 21:19:43 +0000 (14:19 -0700)]
osd: Add debug_heartbeat_testing_span to allow quicker testing

Signed-off-by: David Zafman <dzafman@redhat.com>
(cherry picked from commit 573aea2bb1d48237df5182a6e4421e15c1eea88c)

5 years agoosd: Add debug_disable_randomized_ping config for use in testing
David Zafman [Wed, 24 Jul 2019 01:10:46 +0000 (18:10 -0700)]
osd: Add debug_disable_randomized_ping config for use in testing

Signed-off-by: David Zafman <dzafman@redhat.com>
(cherry picked from commit f2b26d88f0a0727f0362ccd8b287f8bb3f41dc3c)

Conflicts:
src/osd/OSD.cc (trivial)
src/common/options.cc (trivial)

5 years agoosd mgr: Add osd_mon_heartbeat_stat_stale option to time out ping info
David Zafman [Mon, 22 Jul 2019 18:52:41 +0000 (11:52 -0700)]
osd mgr: Add osd_mon_heartbeat_stat_stale option to time out ping info
after 1 hour

Signed-off-by: David Zafman <dzafman@redhat.com>
(cherry picked from commit 048f8096265dd3a647adb970255e4b11c9617b2e)

Conflicts:
src/osd/OSD.cc (trivial)

5 years agomon: Indicate when an osd with slow ping time is down
David Zafman [Fri, 19 Jul 2019 04:29:49 +0000 (21:29 -0700)]
mon: Indicate when an osd with slow ping time is down

Signed-off-by: David Zafman <dzafman@redhat.com>
(cherry picked from commit 5ab145d6402a2525d69296de95b36214bc4c7431)

5 years agoosd mon: Add last_update to osd_stat_t heartbeat info
David Zafman [Fri, 19 Jul 2019 04:28:16 +0000 (21:28 -0700)]
osd mon: Add last_update to osd_stat_t heartbeat info

Ignore old heartbeat info which hasn't updated

Signed-off-by: David Zafman <dzafman@redhat.com>
(cherry picked from commit ea20d3522aaf644cef989c565e11dd781e420e18)

Conflicts:
src/osd/osd_types.h (osd_stat_t location in file changed)

5 years agoosd: After first interval populate vectors so 5min/15min values aren't 0
David Zafman [Tue, 16 Jul 2019 19:02:43 +0000 (12:02 -0700)]
osd: After first interval populate vectors so 5min/15min values aren't 0

Signed-off-by: David Zafman <dzafman@redhat.com>
(cherry picked from commit 6555699d289769a44e9840424192a1be1a6ba00d)

5 years agoosd mgr: Store last pingtime for possible graphing
David Zafman [Mon, 15 Jul 2019 20:23:53 +0000 (13:23 -0700)]
osd mgr: Store last pingtime for possible graphing

Signed-off-by: David Zafman <dzafman@redhat.com>
(cherry picked from commit 3f846d7c806b7f62ead08f0e9fb2ba927ffe0592)

Conflicts:
src/osd/osd_types.h (osd_stat_t location in file changed)

5 years agoosd mgr: Add minimum and maximum tracking to network ping time
David Zafman [Fri, 12 Jul 2019 01:06:23 +0000 (01:06 +0000)]
osd mgr: Add minimum and maximum tracking to network ping time

Signed-off-by: David Zafman <dzafman@redhat.com>
(cherry picked from commit 297a0e7b1de410c094fc9a6e42be14813d6dac5e)

Conflicts:
src/osd/osd_types.cc (trivial)
src/osd/osd_types.h (osd_stat_t location in file changed)

5 years agodoc: Add documentation and release notes
David Zafman [Thu, 11 Jul 2019 00:05:47 +0000 (00:05 +0000)]
doc: Add documentation and release notes

Signed-off-by: David Zafman <dzafman@redhat.com>
(cherry picked from commit f4a0be2e8707f921d65bf22a6c1090e402905ad3)

Conflicts:
PendingReleaseNotes (trivial)

5 years agoosd mgr mon: Add mon_warn_on_slow_ping_ratio config as 5% of osd_heartbeat_grace
David Zafman [Thu, 11 Jul 2019 21:24:12 +0000 (21:24 +0000)]
osd mgr mon: Add mon_warn_on_slow_ping_ratio config as 5% of osd_heartbeat_grace

Compute network ping threshold based on ratio (5% of 20 seconds is 1 second)
Make the threshold value used part of dump_osd_network for osd and mgr
Keep mon_warn_on_slow_ping_time (default 0) to optionally override the ratio

Signed-off-by: David Zafman <dzafman@redhat.com>
(cherry picked from commit 0d1bbd34e96e2da2027861229b376805d5ea8aa6)
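
The threshold rule in the commit above can be sketched as follows. This is an illustrative Python function, not Ceph code: the function name is hypothetical, and expressing the override and the result in milliseconds is an assumption taken from the later commits in this series that convert ping times to milliseconds. The parameter names mirror the config options named in the commit message.

```python
def slow_ping_threshold_ms(osd_heartbeat_grace: float = 20.0,
                           mon_warn_on_slow_ping_ratio: float = 0.05,
                           mon_warn_on_slow_ping_time: float = 0.0) -> float:
    """Return the slow-ping warning threshold in milliseconds.

    Hypothetical helper. osd_heartbeat_grace is in seconds;
    mon_warn_on_slow_ping_time is assumed to be in milliseconds,
    with 0 meaning "not set".
    """
    if mon_warn_on_slow_ping_time > 0:
        # An explicit time overrides the ratio-based value.
        return mon_warn_on_slow_ping_time
    # Default: 5% of the 20-second grace period is 1 second.
    return osd_heartbeat_grace * mon_warn_on_slow_ping_ratio * 1000.0


if __name__ == "__main__":
    print(slow_ping_threshold_ms())
    print(slow_ping_threshold_ms(mon_warn_on_slow_ping_time=250.0))
```

Deriving the default from osd_heartbeat_grace means the warning threshold follows any change to the grace period, while the explicit time remains available as a fixed override.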

5 years agomgr: Add "dump_osd_network" mgr admin request to get a sorted report
David Zafman [Tue, 9 Jul 2019 17:22:12 +0000 (17:22 +0000)]
mgr: Add "dump_osd_network" mgr admin request to get a sorted report

Signed-off-by: David Zafman <dzafman@redhat.com>
(cherry picked from commit 5d3c1856415f8b66e31361a0a7b9c75edc46e49e)

Conflicts:
src/mgr/ClusterState.cc (trivial)
src/mgr/ClusterState.h (trivial)

5 years agoosd: Add "dump_osd_network" osd admin request to get a sorted report
David Zafman [Wed, 10 Jul 2019 18:15:44 +0000 (18:15 +0000)]
osd: Add "dump_osd_network" osd admin request to get a sorted report

Signed-off-by: David Zafman <dzafman@redhat.com>
(cherry picked from commit 025b10a5329127734367a6899543f51cd8580d43)

Conflicts:
src/osd/OSD.cc (trivial)

5 years agoosd mon: Track heartbeat ping times and report health warning
David Zafman [Wed, 26 Jun 2019 02:59:06 +0000 (02:59 +0000)]
osd mon: Track heartbeat ping times and report health warning

Fixes: http://tracker.ceph.com/issues/40640
Signed-off-by: David Zafman <dzafman@redhat.com>
(cherry picked from commit 66d44e7f911a57100d650ad7df9445f88ec70140)

Conflicts:
src/common/options.cc (trivial)
src/mon/PGMap.cc (trivial)
src/osd/OSD.cc (trivial)
src/osd/OSD.h (trivial)
src/osd/osd_types.cc (encode version difference)
src/osd/osd_types.h (osd_stat_t location in file changed)

src/mon/PGMap.cc manually get rid of extra argument to checks->add
src/osd/OSD.cc rename ping_stamp to stamp for backport

5 years agoosd/OSD: fix HeartbeatInfo.is_healthy() check
xie xingguo [Mon, 8 Jan 2018 07:02:58 +0000 (15:02 +0800)]
osd/OSD: fix HeartbeatInfo.is_healthy() check

Delay declaring a peer healthy until we have received the first
replies from both the front and back connections.

Signed-off-by: xie xingguo <xie.xingguo@zte.com.cn>
(cherry picked from commit d9123158d1fef329fb9bf5ff787f9c84bb51b44c)

5 years agoosd/OSD: use first_tx to calculate failed_for
xie xingguo [Mon, 8 Jan 2018 02:24:09 +0000 (10:24 +0800)]
osd/OSD: use first_tx to calculate failed_for

If we never hear any replies from a heartbeat peer, use first_tx
to calculate failed_for, which is more accurate.

Signed-off-by: xie xingguo <xie.xingguo@zte.com.cn>
(cherry picked from commit aba603736cbce94f7e1e5ac851ae4d4f43ea63e6)

5 years agoosd: refactor heartbeat health check
xie xingguo [Mon, 16 May 2016 05:50:28 +0000 (13:50 +0800)]
osd: refactor heartbeat health check

The original logic reuses the timestamp at which we sent a ping to a
heartbeat peer to update the last_rx_front[back] field when the
corresponding reply arrives. That field is later treated as the exact
time the reply was received, and is used both to calculate the
heartbeat latency and to determine whether the peer is dead.

However, this is not accurate enough, as there may be a delay between
receiving a reply and calling heartbeat_check(). We can eliminate
the delay by introducing a map that tracks the ping history,
each entry of which consists of three elements:

1. "tx_time", which serves as the map key, is the exact timestamp
   at which we sent the pings.
2. "deadline" is the time by which we must have received all replies;
   otherwise we consider this peer "dead".
3. "unacknowledged" is how many pings sent at that tx_time are still
   unacknowledged. The initial value is 2 (we send one ping on the
   front side and one on the back side for each peer).

We insert an item into the map every time we send out a ping, and
decrement the "unacknowledged" counter by 1 each time we get a reply to
the tracked ping. When "unacknowledged" drops to 0, we know all the
replies have been collected, and we can safely erase that item from the
map along with any earlier ones.

By comparing the current timestamp with the oldest deadline, we can now
make a much more accurate decision about whether the corresponding peer
is healthy. And by setting last_rx_* to the timestamp at which we
received the reply, the lower bound on when we stopped hearing replies
from the corresponding connection is also much clearer.

Signed-off-by: xie xingguo <xie.xingguo@zte.com.cn>
(cherry picked from commit 477774ceee42641f6d6884536462f92567bfea11)

Conflicts:
src/osd/OSD.cc (send_still_alive() has 1 less argument)
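
The ping-history bookkeeping described in the commit above can be sketched as follows. This is a simplified Python model, not the OSD code: the class name HeartbeatPeer, its method names, and the single grace parameter are hypothetical. Only tx_time, deadline, unacknowledged and last_rx_* come from the commit message. The sketch does not model the later fix that delays declaring a peer healthy until the first replies arrive.

```python
from dataclasses import dataclass, field
from typing import Dict, Tuple


@dataclass
class HeartbeatPeer:
    """Hypothetical model of one heartbeat peer's ping history."""
    grace: float  # seconds allowed between sending a ping and its deadline
    # tx_time -> (deadline, unacknowledged)
    ping_history: Dict[float, Tuple[float, int]] = field(default_factory=dict)
    # "front" / "back" -> time the most recent reply was received
    last_rx: Dict[str, float] = field(default_factory=dict)

    def send_ping(self, now: float) -> None:
        # Two pings go out per peer (front and back), so two replies are due.
        self.ping_history[now] = (now + self.grace, 2)

    def handle_reply(self, tx_time: float, now: float, side: str) -> None:
        # Record the receive time, not the send time.
        self.last_rx[side] = now
        entry = self.ping_history.get(tx_time)
        if entry is None:
            return  # already acknowledged and erased
        deadline, unacknowledged = entry
        unacknowledged -= 1
        if unacknowledged > 0:
            self.ping_history[tx_time] = (deadline, unacknowledged)
            return
        # Fully acknowledged: erase this entry and any earlier ones.
        for sent in [t for t in self.ping_history if t <= tx_time]:
            del self.ping_history[sent]

    def is_healthy(self, now: float) -> bool:
        if not self.ping_history:
            return True
        oldest_deadline = min(d for d, _ in self.ping_history.values())
        return now <= oldest_deadline


if __name__ == "__main__":
    peer = HeartbeatPeer(grace=20.0)
    peer.send_ping(0.0)
    peer.handle_reply(0.0, 0.1, "front")
    print(peer.is_healthy(20.0), peer.is_healthy(20.5))
```

Because an entry is erased only when both replies have arrived, a peer whose back connection has stopped answering keeps an outstanding entry and is reported unhealthy once that entry's deadline passes.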