Kefu Chai [Fri, 15 Nov 2019 09:55:53 +0000 (17:55 +0800)]
script/run-cbt.sh: set fs.aio-max-nr for seastar
seastar requires 11027 for each reactor, so we need at least
fs.aio-max-nr = 11027 * n_osd
but to keep it simple, and to leave room for other applications using
aio on the machine, we raise fs.aio-max-nr to 32768,
and always apply this setting, no matter whether we are starting a vstart
cluster with classic osd or crimson osd
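The arithmetic in this message can be sketched as follows. This is only an illustration: the function and constant names are hypothetical (they are not taken from run-cbt.sh), and it assumes one reactor per crimson OSD, as the formula above does.

```python
# Illustrative sketch; names are hypothetical, not taken from run-cbt.sh.
AIO_PER_REACTOR = 11027   # per-reactor figure quoted in the commit message
FIXED_AIO_MAX_NR = 32768  # value the commit message says is always set


def required_aio_max_nr(n_osd: int) -> int:
    """Lower bound on fs.aio-max-nr, assuming one reactor per OSD."""
    if n_osd < 0:
        raise ValueError("n_osd must be non-negative")
    return AIO_PER_REACTOR * n_osd


def fixed_value_suffices(n_osd: int) -> bool:
    """True if the fixed setting meets the lower bound for n_osd OSDs."""
    return required_aio_max_nr(n_osd) <= FIXED_AIO_MAX_NR


if __name__ == "__main__":
    for n in (1, 2, 3):
        print(n, required_aio_max_nr(n), fixed_value_suffices(n))
```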
Carlos Valiente [Thu, 14 Nov 2019 18:27:50 +0000 (18:27 +0000)]
src/msg/async/net_handler.cc: Fix compilation
On a Cray system I'm working on, it seems that `SO_PRIORITY` is defined,
but `IPTOS_CLASS_CS6` is not. Without this patch, compilation fails at
line 150:
```
r = ::setsockopt(sd, SOL_SOCKET, SO_PRIORITY, &prio, sizeof(prio));
```
because the variable `r` is not defined (originally it is defined inside
the `#ifdef IPTOS_CLASS_CS6` block).
Fixes: https://tracker.ceph.com/issues/42821
Signed-off-by: Carlos Valiente <carlos.valiente@ecmwf.int>
Sage Weil [Thu, 14 Nov 2019 16:33:24 +0000 (10:33 -0600)]
Merge PR #31584 into master
* refs/pull/31584/head:
common/options: osd_crush_chooseleaf_type is CLUSTER_CREATE
mon/ConfigMonitor: do not assimilate CLUSTER_CREATE options
common/ceph_context: observe container_image so we don't get warnings
Patrick Donnelly [Thu, 14 Nov 2019 14:14:11 +0000 (06:14 -0800)]
Merge PR #30754 into master
* refs/pull/30754/head:
doc/cephfs: merge fstab doc with respective mount docs
doc: add systemd unit part for FUSE mounts in fstab doc
doc: update and improve "mount using kernel driver" doc
doc: update and improve "mount using FUSE" doc
Reviewed-by: Jeff Layton <jlayton@redhat.com>
Reviewed-by: Patrick Donnelly <pdonnell@redhat.com>
Rishabh Dave [Wed, 13 Nov 2019 13:22:18 +0000 (18:52 +0530)]
test_cephfs_shell: initialize stderr for run_cephfs_shell_cmd()
Teuthology initializes stderr to None by default, so without this
initialization, tests that access the stderr of commands they execute
break when run under teuthology.
Fixes: https://tracker.ceph.com/issues/42806
Signed-off-by: Rishabh Dave <ridave@redhat.com>
Patrick Donnelly [Tue, 12 Nov 2019 22:50:52 +0000 (14:50 -0800)]
global: disable THP for Ceph daemons
Ceph is known to suffer from memory fragmentation due to transparent
huge pages (THP). This is indicated by RSS usage above configured memory
targets and is only observable when the distribution default for THP is
"always", which is the default in the upstream kernel if
CONFIG_TRANSPARENT_HUGEPAGE_MADVISE is unset.
For now, enabling THP is discouraged until selective use of
THP by Ceph is implemented via madvise. We will need to consider both
defaults for THP, so madvise calls to both enable and disable THP will
need to be implemented.
All credit to Mark Nelson for doing the legwork identifying this issue
and potential solutions.
Signed-off-by: Patrick Donnelly <pdonnell@redhat.com>
Signed-off-by: Mark Nelson <mnelson@redhat.com>
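To see which system-wide THP default a host uses, one can read the kernel's sysfs file, where the active mode is the word in square brackets. The sketch below is illustrative and not part of this commit; the helper names are hypothetical.

```python
from pathlib import Path
from typing import Optional

# Standard sysfs location of the system-wide THP setting on Linux.
THP_ENABLED = Path("/sys/kernel/mm/transparent_hugepage/enabled")


def parse_thp_mode(text: str) -> str:
    """Return the active mode, i.e. the word enclosed in square brackets."""
    for word in text.split():
        if word.startswith("[") and word.endswith("]"):
            return word[1:-1]
    raise ValueError(f"no active mode found in {text!r}")


def current_thp_mode(path: Path = THP_ENABLED) -> Optional[str]:
    """Read the active mode, or return None if the file cannot be read."""
    try:
        return parse_thp_mode(path.read_text())
    except OSError:
        return None


if __name__ == "__main__":
    print(current_thp_mode())
```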
Kefu Chai [Wed, 13 Nov 2019 12:41:05 +0000 (20:41 +0800)]
install-deps.sh: remove the cleanup for kitware repo
the cleanup was used for removing the kitware repo. in
7265b55d094a639be50a567d3be92fba94c04786, we switched to a rebuilt
version hosted in chacra. that was more than two months ago, so
presumably no builder has the kitware repo anymore. so, let's remove
the cleanup code.
the source deb package comes from
http://ppa.launchpad.net/jonathonf/binutils/ubuntu
i tried to use the binutils 2.30 source package from bionic,
but it has a build dependency on dpkg-dev (>= 1.19.0.5), which cannot be
fulfilled on xenial. so without more changes, we cannot get
binutils 2.30 built on xenial using this source package.
Jan Fajerski [Wed, 13 Nov 2019 09:13:01 +0000 (10:13 +0100)]
ceph-volume: assume msgrV1 for all branches containing mimic
With nautilus and newer, OSDs listen on both v1 and v2 ports. Assume
that if mimic (or luminous) occurs in the branch name, the OSDs are
running msgrv1 only.
Fixes: https://tracker.ceph.com/issues/42791
Signed-off-by: Jan Fajerski <jfajerski@suse.com>
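The rule described above amounts to a substring check on the branch name. A minimal sketch, assuming a hypothetical helper name and constant (this is not the actual ceph-volume code):

```python
# Releases the commit message treats as msgr v1 only.
MSGR_V1_ONLY_RELEASES = ("luminous", "mimic")


def assume_msgr_v1_only(branch: str) -> bool:
    """True if the branch name contains a release assumed to run msgr v1 only."""
    return any(release in branch for release in MSGR_V1_ONLY_RELEASES)


if __name__ == "__main__":
    for name in ("mimic", "wip-luminous-backport", "nautilus", "master"):
        print(name, assume_msgr_v1_only(name))
```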
Yingxin Cheng [Wed, 6 Nov 2019 09:34:32 +0000 (17:34 +0800)]
crimson: build seastar dpdk from src/seastar/dpdk
src/spdk/dpdk and src/seastar/dpdk are both at their private branches
with project-specific modifications, so select proper dpdk source
directory according to flags WITH_SPDK and Seastar_DPDK.
Rishabh Dave [Mon, 4 Nov 2019 14:05:29 +0000 (19:35 +0530)]
doc: add systemd unit part for FUSE mounts in fstab doc
To make FUSE-mounted CephFS persist across reboots, the user also needs to
start and enable the systemd units. Add that part to the document for
fstab, instead of mentioning it in the "Mount CephFS using FUSE" doc. Also,
wrap a few lines and rename the mountpoint to /mnt/mycephfs in examples to
keep them the same across docs.
Fixes: https://tracker.ceph.com/issues/42298
Signed-off-by: Rishabh Dave <ridave@redhat.com>
Rishabh Dave [Tue, 8 Oct 2019 06:59:56 +0000 (12:29 +0530)]
doc: update and improve "mount using kernel driver" doc
Move the examples of the mount command for Ceph clusters with CephX enabled
to the top of the page, since CephX is enabled by default; improve the
explanation around Ceph with multiple FSs; get rid of the hash symbols before
every command (without them it's still clear that the text is a command, and
with them the reader cannot use the commands directly); link the fstab page;
show the general form of the mount command; add the prerequisites for kernel
mounts; and expand the explanation wherever possible.
Fixes: https://tracker.ceph.com/issues/42220
Signed-off-by: Rishabh Dave <ridave@redhat.com>
Rishabh Dave [Mon, 7 Oct 2019 07:27:29 +0000 (12:57 +0530)]
doc: update and improve "mount using FUSE" doc
Recommend keyring permissions of 600 instead of 644, show examples
for the `-k`, `-r`, `-m` and `--client_mds_namespace` options, move
references to the bottom of the page, show how to unmount FUSE-mounted
CephFS, copy the tip about unmounting from the "mount using kernel" page to
the "mount using FUSE" page, correct the commands for automating FUSE mounts,
add sub-headings to the document and show the general form of the ceph-fuse
command.
Fixes: https://tracker.ceph.com/issues/42205
Signed-off-by: Rishabh Dave <ridave@redhat.com>
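For illustration, the 600 permission recommended above (owner read/write only) can be applied to a keyring file as sketched below. The helper name is hypothetical, and the keyring path is whatever the caller passes in.

```python
import os
import stat


def restrict_keyring(path: str) -> int:
    """Set mode 600 (owner read/write only) and return the resulting bits."""
    os.chmod(path, stat.S_IRUSR | stat.S_IWUSR)
    return stat.S_IMODE(os.stat(path).st_mode)


# Example call (the path is an assumption; use your own keyring location):
#   restrict_keyring("/etc/ceph/ceph.client.admin.keyring")
```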
luo rixin [Tue, 12 Nov 2019 08:36:53 +0000 (16:36 +0800)]
mon/PGMap: fix incorrect pg_pool_sum when delete pool
We found that the number of pools displayed by "ceph -s" sometimes
differs from that of "ceph osd lspools" after deleting a pool. The cause
is that Mgr ClusterState::ingest_pgstats receives stale pg_stat entries
from OSDs on which the PG had not yet been deleted when the pool was
deleted, and adds them to pending_inc.pool_statfs_updates.
PGMap::apply_incremental then unintentionally adds the deleted pool back
to pg_pool_sum, even though the pool has already been deleted from
OSDMap. This will also cause a segmentation fault in the MON.
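The idea of ignoring stats for pools that no longer exist can be sketched as below. This is a simplified Python model, not the C++ change in this commit; it assumes the updates are keyed by (pool id, osd id) pairs, and the function name is hypothetical.

```python
from typing import Dict, Set, Tuple

PoolOsdKey = Tuple[int, int]  # assumed key shape: (pool id, osd id)


def drop_deleted_pool_updates(
    pool_statfs_updates: Dict[PoolOsdKey, object],
    existing_pools: Set[int],
) -> Dict[PoolOsdKey, object]:
    """Keep only updates whose pool id is still present in the OSDMap."""
    return {
        key: statfs
        for key, statfs in pool_statfs_updates.items()
        if key[0] in existing_pools
    }


if __name__ == "__main__":
    updates = {(1, 0): "statfs-a", (2, 0): "statfs-b", (2, 1): "statfs-c"}
    # Pool 2 has been deleted, so only the entry for pool 1 survives.
    print(drop_deleted_pool_updates(updates, existing_pools={1}))
```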
Sage Weil [Tue, 12 Nov 2019 23:47:51 +0000 (17:47 -0600)]
common/ceph_context: observe container_image so we don't get warnings
This gets rid of messages like
2019-11-12T23:46:28.578+0000 7f9ab2b70700 -1 set_mon_vals failed to set container_image = cephci/daemon-base:wip-sage2-testing-2019-11-11-1737-4ea2bc7-centos-7-x86_64-devel: Configuration option 'container_image' may not be modified at runtime