git.apps.os.sepia.ceph.com Git - ceph.git/log
ceph.git
5 years agoqa/tasks/ceph: keep mon addrs in ctx namespace 31461/head
Sage Weil [Wed, 19 Dec 2018 03:18:31 +0000 (21:18 -0600)]
qa/tasks/ceph: keep mon addrs in ctx namespace

Signed-off-by: Sage Weil <sage@redhat.com>
(cherry picked from commit 545df766bef04b1a70380ac04909b2a88521e4da)

5 years agoqa/tasks/mon_seesaw: make get_mon_status use mon addr
Nathan Cutler [Thu, 7 Nov 2019 12:37:09 +0000 (13:37 +0100)]
qa/tasks/mon_seesaw: make get_mon_status use mon addr

We don't have the 'mon addr' config property any more.

This commit cannot be cherry-picked from master because qa/tasks/mon_seesaw.py
was dropped in nautilus.

Signed-off-by: Nathan Cutler <ncutler@suse.com>
5 years agoqa/tasks/ceph_manager: make get_mon_status use mon addr
Sage Weil [Wed, 19 Dec 2018 03:18:57 +0000 (21:18 -0600)]
qa/tasks/ceph_manager: make get_mon_status use mon addr

We don't have the 'mon addr' config property any more.

Signed-off-by: Sage Weil <sage@redhat.com>
(cherry picked from commit ac2430a43ddec469575a25be4aff75ce1628eee2)

5 years agoMerge pull request #31324 from idryomov/wip-krbd-lvcreate-args-mimic
Yuri Weinstein [Wed, 6 Nov 2019 15:58:34 +0000 (07:58 -0800)]
Merge pull request #31324 from idryomov/wip-krbd-lvcreate-args-mimic

mimic: qa: krbd_msgr_segments.t: filter lvcreate output

Reviewed-by: Jason Dillaman <dillaman@redhat.com>
5 years agoMerge pull request #31322 from idryomov/wip-krbd-udev-fixes-mimic
Yuri Weinstein [Wed, 6 Nov 2019 15:32:27 +0000 (07:32 -0800)]
Merge pull request #31322 from idryomov/wip-krbd-udev-fixes-mimic

mimic: krbd: avoid udev netlink socket overrun and retry on transient errors from udev_enumerate_scan_devices()

Reviewed-by: Jason Dillaman <dillaman@redhat.com>
5 years agoMerge pull request #31424 from smithfarm/wip-fix-get-mons-mimic
Nathan Cutler [Wed, 6 Nov 2019 13:24:43 +0000 (14:24 +0100)]
Merge pull request #31424 from smithfarm/wip-fix-get-mons-mimic

mimic: tests: qa/tasks/ceph.py: pass cluster_name to get_mons

Reviewed-by: Jason Dillaman <dillaman@redhat.com>
5 years agoqa/tasks/ceph.py/create_simple_monmap: use split_role 31424/head
Nathan Cutler [Wed, 6 Nov 2019 07:00:45 +0000 (08:00 +0100)]
qa/tasks/ceph.py/create_simple_monmap: use split_role

Signed-off-by: Nathan Cutler <ncutler@suse.com>
5 years agoqa/tasks/ceph.py: pass cluster_name to get_mons
Nathan Cutler [Tue, 5 Nov 2019 18:10:15 +0000 (19:10 +0100)]
qa/tasks/ceph.py: pass cluster_name to get_mons

This cannot be cherry-picked from master because it fixes an issue that was
introduced directly into mimic by a bad backport.

Fixes: 276c2b80fd9ce58c5249001bfc98fd84249782cb
Fixes: https://tracker.ceph.com/issues/42658
Signed-off-by: Nathan Cutler <ncutler@suse.com>
5 years agoMerge pull request #31362 from jan--f/c-v-missing-mimic-lvm-backports
Jan Fajerski [Tue, 5 Nov 2019 07:40:04 +0000 (08:40 +0100)]
Merge pull request #31362 from jan--f/c-v-missing-mimic-lvm-backports

More missing mimic backports

5 years agoMerge pull request #31275 from dzafman/wip-network-fix
Yuri Weinstein [Mon, 4 Nov 2019 21:25:00 +0000 (13:25 -0800)]
Merge pull request #31275 from dzafman/wip-network-fix

mimic: core: osd: Fix for compatibility of encode/decode of osd_stat_t

Reviewed-by: Sage Weil <sage@redhat.com>
5 years agoMerge pull request #31013 from smithfarm/wip-42391-mimic
Yuri Weinstein [Mon, 4 Nov 2019 20:36:02 +0000 (12:36 -0800)]
Merge pull request #31013 from smithfarm/wip-42391-mimic

mimic: mgr/balancer: python3 compatibility issue

Reviewed-by: Kefu Chai <kchai@redhat.com>
Reviewed-by: Mykola Golub <mgolub@mirantis.com>
5 years agoMerge pull request #31029 from smithfarm/wip-42198-mimic
Yuri Weinstein [Mon, 4 Nov 2019 20:35:06 +0000 (12:35 -0800)]
Merge pull request #31029 from smithfarm/wip-42198-mimic

mimic: osd/PrimaryLogPG: skip obcs that don't exist during backfill scan_range

Reviewed-by: Samuel Just <sjust@redhat.com>
Reviewed-by: Neha Ojha <nojha@redhat.com>
5 years agoMerge pull request #31035 from smithfarm/wip-40503-mimic
Yuri Weinstein [Mon, 4 Nov 2019 20:34:38 +0000 (12:34 -0800)]
Merge pull request #31035 from smithfarm/wip-40503-mimic

mimic: core: osd: rollforward may need to mark pglog dirty

Reviewed-by: Neha Ojha <nojha@redhat.com>
Reviewed-by: Samuel Just <sjust@redhat.com>
5 years agoMerge pull request #31096 from smithfarm/wip-42394-mimic
Yuri Weinstein [Mon, 4 Nov 2019 20:34:04 +0000 (12:34 -0800)]
Merge pull request #31096 from smithfarm/wip-42394-mimic

mimic: common/ceph_context: avoid unnecessary wait during service thread shutdown

Reviewed-by: Kefu Chai <kchai@redhat.com>
Reviewed-by: Jason Dillaman <dillaman@redhat.com>
5 years agoMerge pull request #31108 from k0ste/mimic_backports
Yuri Weinstein [Mon, 4 Nov 2019 20:33:26 +0000 (12:33 -0800)]
Merge pull request #31108 from k0ste/mimic_backports

mimic: mgr/prometheus: Cast collect_timeout (scrape_interval) to float

Reviewed-by: Jan Fajerski <jfajerski@suse.com>
5 years agoMerge pull request #31273 from badone/wip-mimic-restful-node-items
Yuri Weinstein [Mon, 4 Nov 2019 20:32:58 +0000 (12:32 -0800)]
Merge pull request #31273 from badone/wip-mimic-restful-node-items

mimic: restful: Query nodes_by_id for items

Reviewed-by: Boris Ranto <branto@redhat.com>
5 years agoMerge pull request #31285 from smithfarm/wip-42582-mimic
Yuri Weinstein [Mon, 4 Nov 2019 20:31:43 +0000 (12:31 -0800)]
Merge pull request #31285 from smithfarm/wip-42582-mimic

mimic: tests: install python3-cephfs for fs suite

Reviewed-by: Kefu Chai <kchai@redhat.com>
5 years agoceph-volume: add option to specify a pv for lvcreate 31362/head
Mohamad Gebai [Sun, 31 Mar 2019 17:05:35 +0000 (13:05 -0400)]
ceph-volume: add option to specify a pv for lvcreate

Signed-off-by: Mohamad Gebai <mgebai@suse.com>
(cherry picked from commit 38b2d7a66c76d9c5b1d2e00ada9503f22a7d8fb6)

5 years agoceph-volume: fix typos
Kefu Chai [Tue, 18 Sep 2018 03:22:14 +0000 (11:22 +0800)]
ceph-volume: fix typos

Signed-off-by: Kefu Chai <kchai@redhat.com>
(cherry picked from commit a4ece9cae60c437b7e01db21c9c42363865c295c)

5 years agoMerge pull request #31229 from jan--f/wip-42541-mimic
Jan Fajerski [Sat, 2 Nov 2019 14:20:26 +0000 (15:20 +0100)]
Merge pull request #31229 from jan--f/wip-42541-mimic

mimic: ceph-volume: api/lvm: check if list of LVs is empty

5 years agoqa: krbd_msgr_segments.t: filter lvcreate output 31324/head
Ilya Dryomov [Thu, 21 Jun 2018 15:27:59 +0000 (17:27 +0200)]
qa: krbd_msgr_segments.t: filter lvcreate output

Some versions of lvm emit a log message

  Using default stripesize 64.00 KiB.

which fails the test.

Signed-off-by: Ilya Dryomov <idryomov@gmail.com>
(cherry picked from commit 645252d732c48f4b6da35d2dc5ebe4594ae5f389)
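
The filtering idea in this commit can be sketched as follows. This is a minimal illustration in Python, not the change made to krbd_msgr_segments.t; the helper name `filter_lvcreate_output` and the sample text are assumptions made for the example.

```python
def filter_lvcreate_output(output: str) -> str:
    """Drop the informational stripesize line that some lvm versions print.

    Hypothetical helper: the real test filters the command output itself.
    """
    kept = [
        line for line in output.splitlines()
        if "Using default stripesize" not in line
    ]
    return "\n".join(kept)


# Illustrative sample: the first line is the message quoted in the commit.
sample = '  Using default stripesize 64.00 KiB.\n  Logical volume "lv" created.'
filtered = filter_lvcreate_output(sample)
```

With the noisy line removed, the expected test output no longer depends on the installed lvm version.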

5 years agoqa: add script to stress udev_enumerate_scan_devices() 31322/head
Ilya Dryomov [Tue, 8 Oct 2019 18:12:30 +0000 (20:12 +0200)]
qa: add script to stress udev_enumerate_scan_devices()

Signed-off-by: Ilya Dryomov <idryomov@gmail.com>
(cherry picked from commit b7a0e2adcbd139dae8338e23d5752d42015fa0ad)

5 years agokrbd: retry on an empty list from udev_enumerate_scan_devices()
Ilya Dryomov [Thu, 24 Oct 2019 15:35:23 +0000 (17:35 +0200)]
krbd: retry on an empty list from udev_enumerate_scan_devices()

systemd 219 doesn't have the issue that is worked around in the
previous commit, but has a different one: udev_enumerate_scan_devices()
always succeeds, but sometimes returns an empty list when the device is
actually there.  This happens rarely and at random so I haven't been
able to get to the bottom of it yet, but it looks like another similar
race condition in libudev.

Since an empty list is expected if the device isn't there, retry just
twice with a small sleep in-between.  This appears to be enough: I got
7 occurrences per 600000 "rbd unmap" invocations, all of which needed
a single retry:

  rbd: udev enumerate missed a device, tries = 1

Signed-off-by: Ilya Dryomov <idryomov@gmail.com>
(cherry picked from commit bd37a72e0ec783a1ba91e63e8d96f0bc06007060)

Conflicts:
src/krbd.cc [ krbd_spec not in mimic ]
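
The bounded retry described in this commit might look roughly like the sketch below. It is written in Python for brevity, whereas the actual change is in src/krbd.cc; `enumerate_with_retry`, `scan`, and the delay value are hypothetical names and numbers chosen for illustration.

```python
import time


def enumerate_with_retry(scan, max_retries=2, delay=0.05):
    """Call scan(); on an empty list, retry up to max_retries times.

    `scan` stands in for a wrapper around udev_enumerate_scan_devices()
    that returns the list of matching devices (an assumption of this sketch).
    """
    tries = 0
    while True:
        devices = scan()
        if devices or tries >= max_retries:
            if devices and tries > 0:
                # Same shape as the log line quoted in the commit message.
                print(f"rbd: udev enumerate missed a device, tries = {tries}")
            return devices
        tries += 1
        time.sleep(delay)  # small pause before asking libudev again
```

Because an empty list is also the legitimate answer when the device is absent, the retry count stays small and the final empty result is returned unchanged.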

5 years agokrbd: retry on transient errors from udev_enumerate_scan_devices()
Ilya Dryomov [Mon, 7 Oct 2019 13:32:39 +0000 (15:32 +0200)]
krbd: retry on transient errors from udev_enumerate_scan_devices()

udev_enumerate_scan_devices() doesn't handle disappearing devices well.
If called while some devices are being removed, it sometimes propagates
ENOENT and ENODEV errors encountered operating on directory entries in
/sys that no longer exist.  Some of these errors are suppressed, but
this isn't reliable and varies across versions.  In particular, systemd
239 suppresses ENODEV from sd_device_new_from_syspath() but doesn't
suppress ENODEV from sd_device_get_devnum().  In systemd 243 the call
to sd_device_get_devnum() has been moved, but it still leaks ENOENT
from sd_device_get_is_initialized() (referring to the body of
FOREACH_DIRENT_ALL loop in enumerator_scan_dir_and_add_devices()).

Assume that all ENOENT and ENODEV errors are transient and retry the
call to udev_enumerate_scan_devices().  Don't limit the number of
retries, but log each one.

Fixes: https://tracker.ceph.com/issues/41036
Signed-off-by: Ilya Dryomov <idryomov@gmail.com>
(cherry picked from commit e5921ef4a89f497a0bff6510fce0bb5c242d6172)

Conflicts:
src/krbd.cc [ rbd namespaces not in mimic ]
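
A rough Python sketch of the policy in this commit: treat ENOENT and ENODEV as transient, retry without a limit, and log every retry. The names `scan_until_stable` and `scan` are hypothetical, and the sketch assumes the scan reports failures as OSError; the real code is C++ and checks libudev return values.

```python
import errno
import logging

log = logging.getLogger("krbd-sketch")

# Errors assumed to be transient, per the commit message.
TRANSIENT = {errno.ENOENT, errno.ENODEV}


def scan_until_stable(scan):
    """Retry scan() for as long as it fails with a transient error."""
    retries = 0
    while True:
        try:
            return scan()
        except OSError as e:
            if e.errno not in TRANSIENT:
                raise  # anything else is a real failure
            retries += 1
            log.warning("udev enumerate failed, retrying (retries = %d)", retries)
```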

5 years agoqa: add script to test udev event reaping
Ilya Dryomov [Fri, 11 Oct 2019 12:58:08 +0000 (14:58 +0200)]
qa: add script to test udev event reaping

Signed-off-by: Ilya Dryomov <idryomov@gmail.com>
(cherry picked from commit 898c113f93a05a18f47f2dd6a94f7cf56c858185)

5 years agokrbd: increase udev netlink socket receive buffer to 2M
Ilya Dryomov [Mon, 14 Oct 2019 10:40:43 +0000 (12:40 +0200)]
krbd: increase udev netlink socket receive buffer to 2M

Even though with the previous commit we no longer block between binding
the socket and starting handling events, we still want a larger receive
buffer to accommodate scheduling delays.  Since the filtering is
done in the listener, an estimate focused on just rbd is not accurate,
but anyway: a pair of "rbd" and "block" events for "rbd map" take 2048
bytes in the receive buffer.  This allows for roughly a thousand of
them ("rbd map" and "rbd unmap" require root and libudev makes use of
SO_RCVBUFFORCE so rmem_max limit is ignored).

Signed-off-by: Ilya Dryomov <idryomov@gmail.com>
(cherry picked from commit 1c6cac1acaefdf59c3265d70c8d2191c59f14652)
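
The "roughly a thousand" estimate in this commit follows from simple arithmetic, shown here as a small Python check. It assumes "2M" means 2 MiB; the constant names are made up for the example.

```python
# Assumption: "2M" in the commit title means 2 MiB.
RCVBUF_BYTES = 2 * 1024 * 1024

# From the commit message: one "rbd" + "block" event pair for "rbd map".
BYTES_PER_MAP_EVENT_PAIR = 2048

pairs_that_fit = RCVBUF_BYTES // BYTES_PER_MAP_EVENT_PAIR
print(pairs_that_fit)  # 1024
```

As the commit notes, this ignores non-rbd events, which also occupy the buffer because filtering happens in the listener.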

5 years agokrbd: avoid udev netlink socket overrun
Ilya Dryomov [Thu, 26 Sep 2019 16:06:27 +0000 (18:06 +0200)]
krbd: avoid udev netlink socket overrun

Because the event(s) we are interested in can be delivered while we are
still in the kernel finishing map or unmap, we start listening for udev
events before going into the kernel.  However, if (un)mapping takes its
time, udev netlink socket can be fairly easily overrun -- the filtering
is done on the listener side, so we get to process everything, not just
rbd events.  If any of the events of interest get dropped (ENOBUFS), we
hang in poll().

Go into the kernel in a separate thread and leave the main thread to
run the event loop.  The return value is communicated to the reactor
through a pipe.

Fixes: https://tracker.ceph.com/issues/41404
Signed-off-by: Ilya Dryomov <idryomov@gmail.com>
(cherry picked from commit 5444a1111523bc100bea60958b2671674f6208ac)

Conflicts:
src/krbd.cc [ krbd_spec, ceph_abort_msgf() not in mimic ]
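
The threading arrangement described in this commit can be sketched in Python as below: a helper thread performs the blocking kernel operation while the calling thread keeps draining events, and the result comes back over a pipe. All names (`run_with_event_loop`, `kernel_op`, `handle_event`) are hypothetical, the sketch assumes a POSIX system, and it simplifies the real reactor by returning as soon as the result arrives.

```python
import os
import select
import struct
import threading


def run_with_event_loop(kernel_op, event_fd, handle_event):
    """Run kernel_op in a helper thread while this thread services event_fd.

    Error handling in the worker is omitted to keep the sketch short.
    """
    result_r, result_w = os.pipe()

    def worker():
        ret = kernel_op()  # the potentially slow map/unmap step
        os.write(result_w, struct.pack("i", ret))

    t = threading.Thread(target=worker)
    t.start()
    try:
        while True:
            readable, _, _ = select.select([event_fd, result_r], [], [])
            if event_fd in readable:
                # Keep the event queue drained so it cannot overrun.
                handle_event(os.read(event_fd, 4096))
            if result_r in readable:
                (ret,) = struct.unpack("i", os.read(result_r, 4))
                return ret
    finally:
        t.join()
        os.close(result_r)
        os.close(result_w)
```

The point of the design is that the event source is never left unread for the duration of the kernel call.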

5 years agokrbd: reap all available events before polling again
Ilya Dryomov [Thu, 10 Oct 2019 11:49:26 +0000 (13:49 +0200)]
krbd: reap all available events before polling again

This also exposes errors from udev_monitor_receive_device() which were
previously ignored.

Signed-off-by: Ilya Dryomov <idryomov@gmail.com>
(cherry picked from commit 53aab34dafcca2ec022102a03905e59cfa34fc84)
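
"Reap everything before polling again" can be illustrated with a non-blocking datagram socket in Python. This is an analogy under stated assumptions: `reap_events` is a hypothetical helper, and a UNIX datagram socket stands in for the udev monitor that the C++ code reads with udev_monitor_receive_device().

```python
import socket


def reap_events(sock):
    """Receive every message currently queued on a non-blocking socket.

    Errors other than "would block" propagate to the caller instead of
    being silently ignored.
    """
    events = []
    while True:
        try:
            events.append(sock.recv(4096))
        except BlockingIOError:
            return events  # queue is empty; safe to poll again
```

Usage: call `reap_events()` each time poll reports the socket readable, then process the returned batch before polling again.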

5 years agokrbd: separate event reaping from event processing
Ilya Dryomov [Thu, 10 Oct 2019 08:49:17 +0000 (10:49 +0200)]
krbd: separate event reaping from event processing

Move event processing into UdevMapHandler and UdevUnmapHandler
functors and replace wait_for_udev_{add,remove}() with a single
wait_for_mapping() template.

Signed-off-by: Ilya Dryomov <idryomov@gmail.com>
(cherry picked from commit c84f9e2f2df47361d7a928d0b25cb84ef332c055)

Conflicts:
src/krbd.cc [ krbd_spec not in mimic ]

5 years agokrbd: get rid of poll() timeout
Ilya Dryomov [Fri, 27 Sep 2019 15:14:08 +0000 (17:14 +0200)]
krbd: get rid of poll() timeout

This timeout was added as a (very poor) workaround for an issue
addressed in commit 42dd1eae630f ("krbd: fix rbd map hang due to udev
return subsystem unordered").

Signed-off-by: Ilya Dryomov <idryomov@gmail.com>
(cherry picked from commit ffb66ff7106b9d200a7da355199ab362fb611c31)

Conflicts:
src/krbd.cc [ ceph_abort_msgf() not in mimic ]

5 years agocommon/thread: Fix race condition in make_named_thread
Adam C. Emerson [Tue, 22 Oct 2019 15:39:20 +0000 (11:39 -0400)]
common/thread: Fix race condition in make_named_thread

The thread may well no longer exist by the time we try to set the
name, so have the thread set its own name first thing.

Thanks to Ilya Dryomov <idryomov@gmail.com> for pointing it out.

Signed-off-by: Adam C. Emerson <aemerson@redhat.com>
(cherry picked from commit 2bd106ec0da52e7fcf616d7b3cb20d570c1a5c50)

5 years agoMerge pull request #31236 from dzafman/wip-38282-mimic
Yuri Weinstein [Thu, 31 Oct 2019 16:40:39 +0000 (09:40 -0700)]
Merge pull request #31236 from dzafman/wip-38282-mimic

mimic: osd: fix build_incremental_map_msg

Reviewed-by: Josh Durgin <jdurgin@redhat.com>
5 years agomsg/msg_types.h: do not cast `ceph_entity_name` to `entity_name_t` for printing 31275/head
Kefu Chai [Thu, 7 Feb 2019 13:13:14 +0000 (21:13 +0800)]
msg/msg_types.h: do not cast `ceph_entity_name` to `entity_name_t` for printing

in GCC-9, `-Waddress-of-packed-member` is enabled, so we have warnings like:

src/msg/msg_types.h:142:41: warning: converting a packed 'const
ceph_entity_name' pointer (alignment 1) to a 'const entity_name_t'
pointer (alignment 8) may result in an unaligned pointer value
[-Waddress-of-packed-member]
  142 |   return out << *(const entity_name_t*)&addr;
      |                                         ^~~~

since the alignments of these two structures differ, we cannot
cast a structure with an alignment of 1 to a structure with an
alignment of 8. the code the compiler generates for accessing members
at alignment 8 won't work with members at alignment 1, so we need to
create a temporary structure for printing it.

Signed-off-by: Kefu Chai <kchai@redhat.com>
(cherry picked from commit f1bfe9dbad669faacfde4e74f38fe92253e5a91e)

5 years agoosd: Fix for compatibility of encode/decode of osd_stat_t
David Zafman [Thu, 31 Oct 2019 01:25:24 +0000 (18:25 -0700)]
osd: Fix for compatibility of encode/decode of osd_stat_t

Signed-off-by: David Zafman <dzafman@redhat.com>
5 years agoqa/suites/fs: add python3-cephfs to packages 31285/head
Kefu Chai [Fri, 3 Aug 2018 09:27:20 +0000 (17:27 +0800)]
qa/suites/fs: add python3-cephfs to packages

the default set of packages to install is in
$suite/qa/packages/packages.yaml . see get_package_list() in
teuthology/teuthology/task/install/__init__.py for how we prepare a
package list for install task.

for running python3 tests in
fs/basic_functional/tasks/volume-client, we need to install
python3-cephfs. please note that,
_package_override() in teuthology/teuthology/task/install/rpm.py will
take care of the different naming on centos/rhel, where the python3
packages are named python34-*.

Signed-off-by: Kefu Chai <kchai@redhat.com>
(cherry picked from commit 9754b3769bf07af1617ad48376769df016a58d9d)

Conflicts:
qa/cephfs/begin.yaml

5 years agoqa: do not install python3 packages in task.install
Kefu Chai [Fri, 3 Aug 2018 09:02:49 +0000 (17:02 +0800)]
qa: do not install python3 packages in task.install

This reverts commit c1efd59f618e24cf060d564ac0f21d5b0b57fd4a

task.install.rpm installs packages listed in
$suites/qa/packages/packages.yaml, the package list applies to the
upgrade tests also. but we don't have python3 bindings packages in jewel
-- they were introduced in kraken.

Signed-off-by: Kefu Chai <kchai@redhat.com>
(cherry picked from commit 7e5c85b604c8f9045152d37f54fae4246ea82737)

Conflicts:
qa/packages/packages.yaml

5 years agoMerge pull request #31211 from sebastian-philipp/mimic-ceph-volume-device_id
Jan Fajerski [Thu, 31 Oct 2019 10:48:09 +0000 (11:48 +0100)]
Merge pull request #31211 from sebastian-philipp/mimic-ceph-volume-device_id

mimic: ceph-volume: add Ceph's device id to inventory

5 years agoMerge pull request #31258 from jan--f/wip-41288-mimic
Nathan Cutler [Thu, 31 Oct 2019 10:01:43 +0000 (11:01 +0100)]
Merge pull request #31258 from jan--f/wip-41288-mimic

mimic: doc: update bluestore cache settings and clarify data fraction

Reviewed-by: Igor Fedotov <ifedotov@suse.com>
5 years agorestful: Use node_id for _gather_leaf_ids 31273/head
Boris Ranto [Fri, 25 Oct 2019 12:24:19 +0000 (14:24 +0200)]
restful: Use node_id for _gather_leaf_ids

The _gather_leaf_ids function doesn't need the node structure, it only
needs the id.

Signed-off-by: Boris Ranto <branto@redhat.com>
(cherry picked from commit a325f28d93defbec48793060d6768204da94edd5)

5 years agorestful: Query nodes_by_id for items
Boris Ranto [Thu, 24 Oct 2019 14:54:05 +0000 (16:54 +0200)]
restful: Query nodes_by_id for items

The node dict that is passed to the _gather_leaf_ids function from the
_gather_osds function does not have 'items' in it. We also can't use
buckets at this point since those only exist for leaf nodes, not all
nodes.

We need to query the nodes_by_id dict to get 'items' for a node inside
the _gather_leaf_ids function instead.

Signed-off-by: Boris Ranto <branto@redhat.com>
(cherry picked from commit 4f17cbc8651c4b96f006eeabd62373a6cd992865)
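
The lookup described in this commit can be sketched as follows. This is not the restful module's code: `gather_leaf_ids` is a hypothetical standalone function, and the shape of `nodes_by_id` (a dict of node dicts whose 'items' entries carry an 'id') is an assumption for illustration. It relies on the CRUSH convention that buckets have negative ids and devices have non-negative ids.

```python
def gather_leaf_ids(node_id, nodes_by_id):
    """Return the ids of all leaf nodes at or below node_id."""
    if node_id >= 0:
        return [node_id]  # non-negative id: a device, i.e. a leaf
    leaves = []
    # Look the node up by id to get its 'items', rather than relying on
    # whatever partial node dict the caller happened to have.
    for item in nodes_by_id[node_id].get("items", []):
        leaves.extend(gather_leaf_ids(item["id"], nodes_by_id))
    return leaves


# Illustrative tree: root (-1) holds a host (-2) and osd 2; the host holds osds 0 and 1.
example_nodes = {
    -1: {"items": [{"id": -2}, {"id": 2}]},
    -2: {"items": [{"id": 0}, {"id": 1}]},
}
example_leaves = gather_leaf_ids(-1, example_nodes)
```

Passing only the id, as the follow-up commit does, keeps the function independent of how the caller obtained the node.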

5 years agoMerge pull request #31254 from alfredodeza/wip-rm42292-mimic
Yuri Weinstein [Thu, 31 Oct 2019 00:22:01 +0000 (17:22 -0700)]
Merge pull request #31254 from alfredodeza/wip-rm42292-mimic

mimic: qa/ceph-disk: use a Python2.7 compatible version of pytest

Reviewed-by: Yuri Weinstein <yweinste@redhat.com>
Reviewed-by: Nathan Cutler <ncutler@suse.com>
5 years agoMerge pull request #31227 from jan--f/c-v-missing-mimic-lvm-backports
Alfredo Deza [Wed, 30 Oct 2019 19:12:35 +0000 (15:12 -0400)]
Merge pull request #31227 from jan--f/c-v-missing-mimic-lvm-backports

Add some missing backports to mimic

Reviewed-by: Alfredo Deza <adeza@redhat.com>
5 years agodoc: update bluestore cache settings and clarify data fraction 31258/head
Jan Fajerski [Mon, 29 Apr 2019 12:52:27 +0000 (14:52 +0200)]
doc: update bluestore cache settings and clarify data fraction

Fixes: http://tracker.ceph.com/issues/39522
Signed-off-by: Jan Fajerski <jfajerski@suse.com>
(cherry picked from commit 9d8336a7f418fe2bb11361dd74a214403b1e5be7)

5 years agoqa/ceph-disk: use a Python2.7 compatible version of pytest 31254/head
Alfredo Deza [Fri, 25 Oct 2019 15:49:54 +0000 (11:49 -0400)]
qa/ceph-disk: use a Python2.7 compatible version of pytest

Signed-off-by: Alfredo Deza <adeza@redhat.com>
(cherry picked from commit 149ce7af588b9c052e41d687e722fee9b7255d7c)

5 years agoMerge pull request #28452 from thmour/mimic_test
Yuri Weinstein [Tue, 29 Oct 2019 19:36:53 +0000 (12:36 -0700)]
Merge pull request #28452 from thmour/mimic_test

mimic: mds: stopping MDS with a large cache (40+GB) causes it to miss heartbeats

Reviewed-by: Venky Shankar <vshankar@redhat.com>
5 years agoMerge pull request #30950 from sidharthanup/mds-evict-duplicate-mimic
Yuri Weinstein [Tue, 29 Oct 2019 19:35:58 +0000 (12:35 -0700)]
Merge pull request #30950 from sidharthanup/mds-evict-duplicate-mimic

mimic: mds: Fix duplicate client entries in eviction list

Reviewed-by: Venky Shankar <vshankar@redhat.com>
5 years agoapi/lvm: rewrite a condition 31229/head
Rishabh Dave [Tue, 3 Sep 2019 13:06:23 +0000 (18:36 +0530)]
api/lvm: rewrite a condition

Create the list of logical volumes if the list passed in arguments is
empty and rewrite the condition to make it more readable.

Fixes: https://tracker.ceph.com/issues/41649
Signed-off-by: Rishabh Dave <ridave@redhat.com>
(cherry picked from commit d1f1bfd3635501090f4069be59e0bcde94dd64ec)

5 years agoceph-volume: update volume's tags structure when setting tags 31227/head
Mohamad Gebai [Tue, 2 Apr 2019 10:45:02 +0000 (06:45 -0400)]
ceph-volume: update volume's tags structure when setting tags

Signed-off-by: Mohamad Gebai <mgebai@suse.com>
(cherry picked from commit 4a1198deffb0baf647a6a31e03cbfe98f011ff14)

5 years agoosd: fix build_incremental_map_msg 31236/head
Sage Weil [Wed, 13 Feb 2019 21:01:48 +0000 (15:01 -0600)]
osd: fix build_incremental_map_msg

We need to fall back to an old map if since (the peer's epoch) is *older*
than our oldest.  If it's newer, we have it, and can just send
incrementals.

Fixes: http://tracker.ceph.com/issues/38282
Signed-off-by: Sage Weil <sage@redhat.com>
(cherry picked from commit 57a13adc8d0e34b4bb1a4022eacbb3de2636df53)
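
The corrected decision can be written out as a small Python sketch. It only models which epochs would be sent; `plan_map_message` and its return shape are hypothetical, and the assumption that incrementals follow the full map up to the newest epoch is mine, not stated in the commit.

```python
def plan_map_message(since, oldest, newest):
    """Decide what to send to a peer whose latest epoch is `since`.

    oldest/newest are the oldest and newest map epochs we still hold.
    """
    if since < oldest:
        # The peer is older than our oldest map: start from a full map.
        return {
            "full": oldest,
            "incrementals": list(range(oldest + 1, newest + 1)),
        }
    # The peer's epoch is one we still have: incrementals are enough.
    return {"full": None, "incrementals": list(range(since + 1, newest + 1))}
```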

5 years agoceph-volume: add clear_tag function for LVs
Mohamad Gebai [Sun, 31 Mar 2019 17:06:23 +0000 (13:06 -0400)]
ceph-volume: add clear_tag function for LVs

Signed-off-by: Mohamad Gebai <mgebai@suse.com>
(cherry picked from commit 776d485af8b6225fd4059952df36e40ef0ad12b4)

5 years agoceph-volume: add reduce_vg function
Mohamad Gebai [Sun, 31 Mar 2019 17:04:40 +0000 (13:04 -0400)]
ceph-volume: add reduce_vg function

Signed-off-by: Mohamad Gebai <mgebai@suse.com>
(cherry picked from commit ce0184b5d7f24f2b3b6a9491e0f3c1c847b8c0e7)

5 years agoceph-volume: look for hidden partitions when populating lvs
Mohamad Gebai [Sun, 31 Mar 2019 17:04:10 +0000 (13:04 -0400)]
ceph-volume: look for hidden partitions when populating lvs

Signed-off-by: Mohamad Gebai <mgebai@suse.com>
(cherry picked from commit 463091e46ba4032f1b8d90a6770fd7e2d3277a74)

5 years agoceph-volume: set a 1G extent size when creating vgs
Andrew Schoen [Thu, 29 Nov 2018 19:44:07 +0000 (13:44 -0600)]
ceph-volume: set a 1G extent size when creating vgs

This allows us to create larger lvs than the default of 4m
and is easier to reason about when sizing the lvs as everything is
reported as GBs.

Signed-off-by: Andrew Schoen <aschoen@redhat.com>
(cherry picked from commit 4a1b97efc87f3df15a39a76de074b4791f3528ca)

5 years agoMerge pull request #30225 from dzafman/wip-network-mimic
Yuri Weinstein [Tue, 29 Oct 2019 16:35:10 +0000 (09:35 -0700)]
Merge pull request #30225 from dzafman/wip-network-mimic

mimic: core: Health warnings on long network ping times

Reviewed-by: Neha Ojha <nojha@redhat.com>
5 years agoceph-volume: add Ceph's device id to inventory 31211/head
Sebastian Wagner [Fri, 18 Oct 2019 11:59:44 +0000 (13:59 +0200)]
ceph-volume: add Ceph's device id to inventory

This lets the orchestrator and dashboard show a unified view of devices with SMART data

Signed-off-by: Sebastian Wagner <sebastian.wagner@suse.com>
(cherry picked from commit e70d6041c1a093ed5c2b77abe17e1ede533d9659)

5 years agoosd/OSD: auto mark heartbeat sessions as stale and tear them down 30225/head
xie xingguo [Wed, 26 Jun 2019 06:24:08 +0000 (14:24 +0800)]
osd/OSD: auto mark heartbeat sessions as stale and tear them down

The primary benefit is that the OSD doesn't need to keep a flood of
blocked heartbeat messages around in memory.
This prevents OSDs from accumulating heartbeat messages due to a
broken switch and then exhausting the whole node's memory:

Jun 11 04:19:26 host-192-168-9-12 kernel: [409881.137077] Out of memory:
Kill process 1471476 (ceph-osd) score 47 or sacrifice child
Jun 11 04:19:26 host-192-168-9-12 kernel: [409881.146054] Killed process
1471476 (ceph-osd) total-vm:4822548kB, anon-rss:3097860kB,
file-rss:2556kB, shmem-rss:0kB

Fixes: http://tracker.ceph.com/issues/40586
Signed-off-by: xie xingguo <xie.xingguo@zte.com.cn>
(cherry picked from commit 6cc90f363b8096d2d5fad30e57426d0cea9e3478)

Conflicts:
src/osd/OSD.cc (no boot_finisher.stop() and no lock_guard)
src/osd/OSD.h (trivial)

Fixed get_val() call in reset_heartbeat_peers()

5 years agomds: handle negative decay counter 28452/head
Patrick Donnelly [Sat, 2 Feb 2019 00:00:13 +0000 (16:00 -0800)]
mds: handle negative decay counter

Problem only exists in Luminous/Mimic.

Signed-off-by: Patrick Donnelly <pdonnell@redhat.com>
(cherry-picked from commit 5f23246)

5 years agotest/mds: fix Session cons call
Patrick Donnelly [Fri, 1 Feb 2019 18:07:58 +0000 (10:07 -0800)]
test/mds: fix Session cons call

Problem did not exist in master.

Signed-off-by: Patrick Donnelly <pdonnell@redhat.com>
(cherry-picked from commit 5ed5c51)

Conflicts:
src/test/mds/TestSessionFilter.cc

5 years agomds: simplify recall warnings
Patrick Donnelly [Mon, 28 Jan 2019 23:48:38 +0000 (15:48 -0800)]
mds: simplify recall warnings

Instead of a timeout and complicated decisions about whether the client is
releasing caps in an expeditious fashion, just use a DecayCounter that tracks
the number of caps we've recalled. This counter is decremented whenever the
client releases caps. If the counter passes a threshold, then we raise the
warning.

Similar reworking is done for the steady-state recall of client caps. Another
release DecayCounter is added so we can tell when the client is not releasing
any more caps.

Signed-off-by: Patrick Donnelly <pdonnell@redhat.com>
(cherry picked from commit c0b3a11)

Conflicts:
PendingReleaseNotes
src/mds/Beacon.cc
src/mds/Server.cc
src/mds/SessionMap.cc
src/mds/SessionMap.h
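
The counter-based warning described in this commit can be illustrated with a minimal exponential decay counter in Python. This is a sketch of the general technique, not Ceph's DecayCounter class; the half-life parameter, the explicit `now` arguments, and the `should_warn` helper are assumptions made so the example is deterministic.

```python
import math


class DecayCounter:
    """Value that halves every `half_life` seconds unless topped up."""

    def __init__(self, half_life):
        self.rate = math.log(2) / half_life
        self.value = 0.0
        self.last = 0.0

    def _decay(self, now):
        self.value *= math.exp(-self.rate * (now - self.last))
        self.last = now

    def hit(self, now, amount=1.0):
        """Add `amount` (negative when the client releases caps)."""
        self._decay(now)
        self.value += amount
        return self.value

    def get(self, now):
        self._decay(now)
        return self.value


def should_warn(recalled, now, threshold):
    """Raise the warning once the recalled-caps counter passes the threshold."""
    return recalled.get(now) > threshold
```

Recalls call `hit()` with a positive amount and releases with a negative one, so a client that keeps up never accumulates enough to cross the threshold.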

5 years agomds: add extra details for cache drop output
Patrick Donnelly [Fri, 25 Jan 2019 23:59:13 +0000 (15:59 -0800)]
mds: add extra details for cache drop output

Signed-off-by: Patrick Donnelly <pdonnell@redhat.com>
(cherry picked from commit 3bc093f)

Conflicts:
src/mds/Server.cc

5 years agoqa: test mds_max_caps_per_client conf
Patrick Donnelly [Fri, 25 Jan 2019 20:13:50 +0000 (12:13 -0800)]
qa: test mds_max_caps_per_client conf

Test that the MDS will not let a client sit above mds_max_caps_per_client caps.

Signed-off-by: Patrick Donnelly <pdonnell@redhat.com>
(cherry picked from commit 30aaa88)

5 years agomds: limit maximum number of caps held by session
Patrick Donnelly [Thu, 24 Jan 2019 22:23:08 +0000 (14:23 -0800)]
mds: limit maximum number of caps held by session

This is to prevent unsustainable situations where a client has so many
outstanding caps that a linear traversal/operation on the session's caps takes
unacceptable amounts of time.

Fixes: http://tracker.ceph.com/issues/38022
Signed-off-by: Patrick Donnelly <pdonnell@redhat.com>
(cherry picked from commit 48ca097)

Conflicts:
PendingReleaseNotes
src/mds/Server.cc

5 years agomds: adapt drop cache for incremental recall
Patrick Donnelly [Thu, 24 Jan 2019 22:22:42 +0000 (14:22 -0800)]
mds: adapt drop cache for incremental recall

Signed-off-by: Patrick Donnelly <pdonnell@redhat.com>
(cherry picked from commit 7244cae)

5 years agomds: recall caps incrementally
Patrick Donnelly [Wed, 23 Jan 2019 14:41:55 +0000 (06:41 -0800)]
mds: recall caps incrementally

As with trimming, use DecayCounters to throttle the number of caps we recall,
both globally and per-session.

Signed-off-by: Patrick Donnelly <pdonnell@redhat.com>
(cherry picked from commit ef46216)

Conflicts:
PendingReleaseNotes
qa/suites/fs/bugs/client_trim_caps/tasks/trim-i22073.yaml
src/mds/Beacon.cc
src/mds/MDSDaemon.cc
src/mds/Server.cc
src/mds/Server.h
src/mds/SessionMap.cc
src/mds/SessionMap.h

5 years agomds: cleanup Session init
Patrick Donnelly [Mon, 21 Jan 2019 18:57:45 +0000 (10:57 -0800)]
mds: cleanup Session init

Signed-off-by: Patrick Donnelly <pdonnell@redhat.com>
(cherry picked from commit ce153b8)

Conflicts:
src/mds/SessionMap.cc
src/mds/SessionMap.h

5 years agomds: adapt drop cache for incremental trim
Patrick Donnelly [Sun, 20 Jan 2019 04:40:11 +0000 (20:40 -0800)]
mds: adapt drop cache for incremental trim

Signed-off-by: Patrick Donnelly <pdonnell@redhat.com>
(cherry picked from commit b750b3b)

5 years agomds: add throttle for trimming MDCache
Patrick Donnelly [Sat, 19 Jan 2019 00:18:59 +0000 (16:18 -0800)]
mds: add throttle for trimming MDCache

This is necessary when the MDS cache size decreases by a significant amount.
For example, when stopping a large MDS or when the operator makes a large cache
size reduction.

Fixes: http://tracker.ceph.com/issues/37723
Signed-off-by: Patrick Donnelly <pdonnell@redhat.com>
(cherry picked from commit 7bf2f31)

Conflicts:
PendingReleaseNotes
src/mds/MDCache.cc
src/mds/MDCache.h

5 years agomds: cleanup SessionMap init
Patrick Donnelly [Fri, 18 Jan 2019 23:43:48 +0000 (15:43 -0800)]
mds: cleanup SessionMap init

Signed-off-by: Patrick Donnelly <pdonnell@redhat.com>
(cherry picked from commit 69efdaf)

Conflicts:
src/mds/SessionMap.h

5 years agomgr/prometheus: Cast collect_timeout (scrape_interval) to float 31108/head
Benjeman Meekhof [Mon, 29 Jul 2019 14:28:40 +0000 (10:28 -0400)]
mgr/prometheus: Cast collect_timeout (scrape_interval) to float

If set by the user, the scrape_interval option is returned as a non-float by get_localized_module_option.
As a result, the metric cache timeout comparison always returns true and data is never refreshed.

Fixes: https://tracker.ceph.com/issues/40997
Signed-off-by: Ben Meekhof <bmeekhof@umich.edu>
(cherry picked from commit 26a74a0d83e068b0bb762c4c7066b4b195187e94)

Conflicts:
- path: src/pybind/mgr/prometheus/module.py
  comment: get_localized_module_option() in master, get_localized_config() in mimic
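
The effect of the cast can be sketched in Python as below. `cache_expired` and its parameters are hypothetical and do not mirror the prometheus module's code; the sketch only shows why converting the option to float before comparing matters. Under Python 2, a number compared with a string is always ordered before it, which is consistent with the constant comparison result the commit describes; Python 3 would raise TypeError instead.

```python
import time


def cache_expired(collect_time, raw_timeout, now=None):
    """Return True when cached metrics are older than the timeout.

    raw_timeout may arrive as a string when the user set the option,
    so convert it before comparing it with a float age.
    """
    timeout = float(raw_timeout)
    now = time.time() if now is None else now
    return now - collect_time > timeout
```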

5 years agocommon/ceph_context: avoid unnecessary wait during service thread shutdown 31096/head
Jason Dillaman [Tue, 15 Oct 2019 22:19:15 +0000 (18:19 -0400)]
common/ceph_context: avoid unnecessary wait during service thread shutdown

Fixes: https://tracker.ceph.com/issues/42332
Signed-off-by: Jason Dillaman <dillaman@redhat.com>
(cherry picked from commit e8249d3b28f8789b2d4aca6fb75c75103a5cbea1)

Conflicts:
src/common/ceph_context.cc
- Mutex::Locker

5 years agoMerge pull request #28585 from ukernel/mimic-40327
Yuri Weinstein [Wed, 23 Oct 2019 15:32:06 +0000 (08:32 -0700)]
Merge pull request #28585 from ukernel/mimic-40327

mimic: mds: change how mds revoke stale caps

Reviewed-by: Venky Shankar <vshankar@redhat.com>
5 years agoMerge pull request #30841 from smithfarm/wip-42263-mimic
Yuri Weinstein [Wed, 23 Oct 2019 15:31:21 +0000 (08:31 -0700)]
Merge pull request #30841 from smithfarm/wip-42263-mimic

mimic: tests: do not take ceph.conf.template from ceph/teuthology.git

Reviewed-by: Neha Ojha <nojha@redhat.com>
Reviewed-by: Kefu Chai <kchai@redhat.com>
5 years agoMerge pull request #30918 from smithfarm/wip-42122-mimic
Yuri Weinstein [Wed, 23 Oct 2019 15:28:47 +0000 (08:28 -0700)]
Merge pull request #30918 from smithfarm/wip-42122-mimic

mimic: cephfs: client: add procession of SEEK_HOLE and SEEK_DATA in lseek.

Reviewed-by: Venky Shankar <vshankar@redhat.com>
5 years agoMerge pull request #30979 from smithfarm/wip-41464-mimic
Yuri Weinstein [Wed, 23 Oct 2019 15:28:14 +0000 (08:28 -0700)]
Merge pull request #30979 from smithfarm/wip-41464-mimic

mimic: tools: ceph-objectstore-tool: update-mon-db: do not fail if incmap is missing

Reviewed-by: Kefu Chai <kchai@redhat.com>
Reviewed-by: Neha Ojha <nojha@redhat.com>
5 years agoMerge pull request #31017 from smithfarm/wip-40896-mimic-revert
Yuri Weinstein [Wed, 23 Oct 2019 15:27:29 +0000 (08:27 -0700)]
Merge pull request #31017 from smithfarm/wip-40896-mimic-revert

mimic: cephfs: Revert "ceph_volume_client: convert string to bytes object"

Reviewed-by: Venky Shankar <vshankar@redhat.com>
5 years agoMerge pull request #29219 from smithfarm/wip-38875-mimic
Yuri Weinstein [Wed, 23 Oct 2019 15:25:53 +0000 (08:25 -0700)]
Merge pull request #29219 from smithfarm/wip-38875-mimic

mimic: mds: high debug logging with many subtrees is slow

Reviewed-by: Venky Shankar <vshankar@redhat.com>
5 years agoMerge pull request #30932 from smithfarm/wip-42034-mimic
Yuri Weinstein [Wed, 23 Oct 2019 15:25:24 +0000 (08:25 -0700)]
Merge pull request #30932 from smithfarm/wip-42034-mimic

mimic: cephfs: client: EINVAL may be returned when offset is 0.

Reviewed-by: Venky Shankar <vshankar@redhat.com>
5 years ago Merge pull request #30933 from smithfarm/wip-42038-mimic
Yuri Weinstein [Wed, 23 Oct 2019 15:24:59 +0000 (08:24 -0700)]
Merge pull request #30933 from smithfarm/wip-42038-mimic

mimic: cephfs: client: _readdir_cache_cb() may use the readdir_cache already clear

Reviewed-by: Venky Shankar <vshankar@redhat.com>
5 years ago Merge pull request #31090 from smithfarm/wip-42416-mimic
Nathan Cutler [Wed, 23 Oct 2019 15:10:09 +0000 (17:10 +0200)]
Merge pull request #31090 from smithfarm/wip-42416-mimic

mimic: doc/rbd: s/guess/xml/ for codeblock lexer

Reviewed-by: Jason Dillaman <dillaman@redhat.com>
5 years ago doc/rbd: s/guess/xml/ for codeblock lexer 31090/head
Kefu Chai [Wed, 16 Oct 2019 04:34:19 +0000 (12:34 +0800)]
doc/rbd: s/guess/xml/ for codeblock lexer

This change silences the following warning:

```
doc/rbd/qemu-rbd.rst:174: WARNING: Pygments lexer name 'guess' is not
known
```

See http://pygments.org/docs/lexers/; we should use "xml" for XML.

Signed-off-by: Kefu Chai <kchai@redhat.com>
(cherry picked from commit df226da996e468d2707b08eb012d54b4e37ffdc6)

5 years ago Merge pull request #30775 from smithfarm/wip-41979-mimic
Yuri Weinstein [Tue, 22 Oct 2019 18:41:52 +0000 (11:41 -0700)]
Merge pull request #30775 from smithfarm/wip-41979-mimic

mimic: rgw: fix list versions starts with version_id=null

Reviewed-by: Casey Bodley <cbodley@redhat.com>
5 years ago Merge pull request #30868 from smithfarm/wip-41324-mimic
Yuri Weinstein [Tue, 22 Oct 2019 18:41:26 +0000 (11:41 -0700)]
Merge pull request #30868 from smithfarm/wip-41324-mimic

mimic: rgw: datalog/mdlog trim commands loop until done

Reviewed-by: Casey Bodley <cbodley@redhat.com>
5 years ago Merge pull request #30980 from smithfarm/wip-41496-mimic
Yuri Weinstein [Tue, 22 Oct 2019 18:40:59 +0000 (11:40 -0700)]
Merge pull request #30980 from smithfarm/wip-41496-mimic

mimic: rgw: fix the bug of rgw not doing necessary checking to website configuration

Reviewed-by: Casey Bodley <cbodley@redhat.com>
5 years ago Merge pull request #30891 from smithfarm/wip-41715-mimic
Yuri Weinstein [Tue, 22 Oct 2019 15:05:57 +0000 (08:05 -0700)]
Merge pull request #30891 from smithfarm/wip-41715-mimic

mimic: rgw: fix refcount tags to match and update object's idtag

Reviewed-by: J. Eric Ivancich <ivancich@redhat.com>
5 years ago Merge pull request #30977 from theanalyst/wip-41570-mimic
Yuri Weinstein [Tue, 22 Oct 2019 15:05:09 +0000 (08:05 -0700)]
Merge pull request #30977 from theanalyst/wip-41570-mimic

mimic: rgw: asio: check the remote endpoint before processing requests

Reviewed-by: Casey Bodley <cbodley@redhat.com>
Reviewed-by: Yuri Weinstein <yweinste@redhat.com>
5 years ago qa/tasks/ceph.conf: do not warn on TOO_FEW_OSDS 30841/head
Sage Weil [Fri, 10 May 2019 19:45:22 +0000 (14:45 -0500)]
qa/tasks/ceph.conf: do not warn on TOO_FEW_OSDS

Signed-off-by: Sage Weil <sage@redhat.com>
(cherry picked from commit 0483c1c3e7ffdfa6a6f65c5ef000c45d2f096428)

5 years ago osd: accident of rollforward may need to mark pglog dirty 31035/head
Zengran Zhang [Tue, 18 Jun 2019 03:32:33 +0000 (11:32 +0800)]
osd: accident of rollforward may need to mark pglog dirty

refers: https://github.com/ceph/ceph/pull/27015/files#r294114392

Fixes: http://tracker.ceph.com/issues/40403
Signed-off-by: Zengran Zhang <zhangzengran@sangfor.com.cn>
(cherry picked from commit 35cb184becd6562edd240553dfa50f47bb120b7f)

5 years ago OSD: rollforward may need to mark pglog dirty
Zengran Zhang [Sun, 17 Mar 2019 02:05:11 +0000 (10:05 +0800)]
OSD: rollforward may need to mark pglog dirty

If we rollforward at the end of PG::activate(), we may advance the *crt*,
but we did not mark the log dirty. This means the crt is not updated
within the rollforward transaction, which leaves it inconsistent.

Signed-off-by: Zengran Zhang <zhangzengran@sangfor.com.cn>
(cherry picked from commit 10d0990dc69310864b4845ee57b32610a642464f)

Conflicts:
src/osd/PGLog.h
dfbe5e070cc978253abcb30b86de5faa7e6a1efc is not being backported
- retain !touched_log as part of conditional in is_dirty()
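
The inconsistency described in this commit message can be illustrated with a minimal Python sketch. It is not the Ceph implementation: `ToyLog`, `roll_forward`, `persist`, `run`, and the `store` dict are hypothetical names chosen for illustration, and the only assumption carried over from the message is that an advanced crt reaches persistent storage only when the log has been marked dirty.

```python
class ToyLog:
    """Hypothetical in-memory log with a crt marker and a dirty flag."""

    def __init__(self, crt=0):
        self.crt = crt
        self.dirty = False

    def roll_forward(self, to, mark_dirty=True):
        # Advancing crt changes state that must also be written out.
        if to > self.crt:
            self.crt = to
            if mark_dirty:
                self.dirty = True

    def persist(self, store):
        # Only a dirty log is written as part of the transaction.
        if self.dirty:
            store["crt"] = self.crt
            self.dirty = False


def run(mark_dirty):
    """Return (in-memory crt, persisted crt) after one rollforward."""
    store = {"crt": 0}
    log = ToyLog()
    log.roll_forward(5, mark_dirty=mark_dirty)
    log.persist(store)
    return log.crt, store["crt"]
```

In this sketch, `run(False)` leaves the in-memory and persisted values out of step, while `run(True)` keeps them equal, which is the behaviour the fix aims for.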

5 years ago Merge pull request #30713 from smithfarm/wip-40258-mimic
Yuri Weinstein [Mon, 21 Oct 2019 23:53:42 +0000 (16:53 -0700)]
Merge pull request #30713 from smithfarm/wip-40258-mimic

mimic: cmake: detect armv8 crc and crypto feature using CHECK_C_COMPILER_FLAG

Reviewed-by: Kefu Chai <kchai@redhat.com>
Reviewed-by: Josh Durgin <jdurgin@redhat.com>
5 years ago Merge pull request #30893 from smithfarm/wip-41964-mimic
Yuri Weinstein [Mon, 21 Oct 2019 23:52:41 +0000 (16:52 -0700)]
Merge pull request #30893 from smithfarm/wip-41964-mimic

mimic: tools/rados: list objects in a pg

Reviewed-by: Vikhyat Umrao <vikhyat@redhat.com>
Reviewed-by: Neha Ojha <nojha@redhat.com>
5 years ago Merge pull request #30898 from smithfarm/wip-42128-mimic
Yuri Weinstein [Mon, 21 Oct 2019 23:51:55 +0000 (16:51 -0700)]
Merge pull request #30898 from smithfarm/wip-42128-mimic

mimic: osd/OSDMap: do not trust partially simplified pg_upmap_item

Reviewed-by: xie xingguo <xie.xingguo@zte.com.cn>
5 years ago Merge pull request #30903 from smithfarm/wip-42154-mimic
Yuri Weinstein [Mon, 21 Oct 2019 23:51:30 +0000 (16:51 -0700)]
Merge pull request #30903 from smithfarm/wip-42154-mimic

mimic: mon/OSDMonitor: trim not-longer-exist failure reporters

Reviewed-by: Neha Ojha <nojha@redhat.com>
Reviewed-by: Kefu Chai <kchai@redhat.com>
5 years ago Merge pull request #30924 from vumrao/wip-vumrao-42240
Yuri Weinstein [Mon, 21 Oct 2019 23:50:40 +0000 (16:50 -0700)]
Merge pull request #30924 from vumrao/wip-vumrao-42240

mimic: osd/PG: Add PG to large omap log message

Reviewed-by: Brad Hubbard <bhubbard@redhat.com>
Reviewed-by: Neha Ojha <nojha@redhat.com>
5 years ago Merge pull request #30846 from wido/mimic-42116
Yuri Weinstein [Mon, 21 Oct 2019 23:48:02 +0000 (16:48 -0700)]
Merge pull request #30846 from wido/mimic-42116

mimic: mgr/telemetry: Ignore crashes in report when module not enabled

Reviewed-by: Sage Weil <sage@redhat.com>
5 years ago Merge pull request #30895 from smithfarm/wip-42036-mimic
Yuri Weinstein [Mon, 21 Oct 2019 23:47:22 +0000 (16:47 -0700)]
Merge pull request #30895 from smithfarm/wip-42036-mimic

mimic: osd/PeeringState: recover_got - add special handler for empty log

Reviewed-by: xie xingguo <xie.xingguo@zte.com.cn>
Reviewed-by: Neha Ojha <nojha@redhat.com>
5 years ago Merge pull request #30901 from smithfarm/wip-42137-mimic
Yuri Weinstein [Mon, 21 Oct 2019 23:46:52 +0000 (16:46 -0700)]
Merge pull request #30901 from smithfarm/wip-42137-mimic

mimic: osd: Remove unused osdmap flags full, nearfull from output

Reviewed-by: David Zafman <dzafman@redhat.com>
Reviewed-by: Kefu Chai <kchai@redhat.com>
Reviewed-by: xie xingguo <xie.xingguo@zte.com.cn>
5 years ago Merge pull request #30916 from smithfarm/wip-41457-mimic
Yuri Weinstein [Mon, 21 Oct 2019 23:45:59 +0000 (16:45 -0700)]
Merge pull request #30916 from smithfarm/wip-41457-mimic

mimic: osd: merge replica log on primary need according to replica log's crt

Reviewed-by: Neha Ojha <nojha@redhat.com>
5 years ago Merge pull request #30982 from tchaikov/wip-mimic-42362
Yuri Weinstein [Mon, 21 Oct 2019 23:45:30 +0000 (16:45 -0700)]
Merge pull request #30982 from tchaikov/wip-mimic-42362

mimic: build/ops: python3-cephfs should provide python36-cephfs

Reviewed-by: Nathan Cutler <ncutler@suse.com>
5 years ago Merge pull request #30991 from smithfarm/wip-37520-mimic-revert
Yuri Weinstein [Mon, 21 Oct 2019 23:44:32 +0000 (16:44 -0700)]
Merge pull request #30991 from smithfarm/wip-37520-mimic-revert

mimic: msg: Revert "msg/async: do not trigger RESETSESSION from connect fault during connection phase"

Reviewed-by: Ricardo Dias <rdias@suse.com>
Reviewed-by: xie xingguo <xie.xingguo@zte.com.cn>
5 years ago osd/PrimaryLogPG: skip obcs that don't exist during backfill scan_range 31029/head
Sage Weil [Thu, 3 Oct 2019 18:00:45 +0000 (13:00 -0500)]
osd/PrimaryLogPG: skip obcs that don't exist during backfill scan_range

We already skip objects for which getattr() returns ENOENT, but
sometimes the object is in our obc cache with exists=false.
Skip those too.

Fixes: https://tracker.ceph.com/issues/42177
Signed-off-by: Sage Weil <sage@redhat.com>
(cherry picked from commit b700c17ec053c8ffb178d6bd44edb2d643fe8fb6)
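
The two skip conditions described in this commit message can be sketched in Python as follows. This is an illustration under stated assumptions, not the PrimaryLogPG code: `scan_range` here is a hypothetical standalone function, `obc_cache` is modelled as a plain dict of `{"exists": bool}` entries, and `getattr_fn` stands in for the attribute lookup that may fail with ENOENT.

```python
import errno


def scan_range(names, obc_cache, getattr_fn):
    """Return the names to backfill, skipping objects that do not exist."""
    found = []
    for name in names:
        obc = obc_cache.get(name)
        if obc is not None:
            # Cached context: skip it when it is marked exists=false.
            if not obc["exists"]:
                continue
            found.append(name)
            continue
        try:
            getattr_fn(name)
        except OSError as e:
            # Pre-existing behaviour: skip objects that report ENOENT.
            if e.errno == errno.ENOENT:
                continue
            raise
        found.append(name)
    return found


def demo():
    """Exercise both skip paths with hypothetical object names."""
    on_disk = {"a", "c"}

    def getattr_fn(name):
        if name not in on_disk:
            raise OSError(errno.ENOENT, "no such object")
        return {}

    cache = {"c": {"exists": False}, "d": {"exists": True}}
    return scan_range(["a", "b", "c", "d"], cache, getattr_fn)
```

Here `demo()` keeps `a` and `d`: `b` is dropped by the ENOENT path and `c` by the cached exists=false path.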