Sage Weil [Tue, 25 May 2021 20:10:49 +0000 (16:10 -0400)]
mgr/cephadm: convert host addr if non-IP to IP
Previously we allowed the host.addr to be a DNS name (short or FQDN).
This is problematic because of the inconsistent way that docker and podman
handle /etc/hosts, and undesirable because relying on external DNS adds
a source of failure for the cluster without any benefit in return
(simply updating DNS is not sufficient to make ceph behave).
So: update any non-IP to an IP as soon as we start up (presumably on
upgrade). If we get a loopback address (127.0.0.1 or 127.0.1.1), then
wait and hope that the next instance of the manager has better luck.
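A minimal sketch of that conversion in Python (the helper name and exact policy are illustrative, not the actual mgr/cephadm code):
```python
import ipaddress
import socket
from typing import Optional

def convert_to_ip(addr: str) -> Optional[str]:
    """Return addr as an IP, or None to retry on a later mgr instance."""
    try:
        ipaddress.ip_address(addr)
        return addr                       # already an IP; nothing to do
    except ValueError:
        pass                              # a DNS name; resolve it below
    try:
        ip = socket.getaddrinfo(addr, None, proto=socket.IPPROTO_TCP)[0][4][0]
    except socket.gaierror:
        return None                       # resolution failed; try again later
    if ip in ('127.0.0.1', '127.0.1.1'):
        return None                       # loopback; hope the next mgr does better
    return ip
```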
Sage Weil [Tue, 25 May 2021 17:00:35 +0000 (13:00 -0400)]
mgr/dashboard,prometheus: new method of getting mgr IP
- Use a centralized method get_mgr_ip()
- Look up the hostname via DNS. This is a bit more reliable than
getfqdn() since it will work even when podman adds the container
name to /etc/hosts.
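One plausible shape for such a method, sketched in Python (an illustration only; the real module code differs):
```python
import socket

def get_mgr_ip() -> str:
    # Resolve our own hostname through the resolver rather than relying
    # on getfqdn(), which podman's /etc/hosts entries can confuse.
    hostname = socket.gethostname()
    return socket.getaddrinfo(hostname, None,
                              proto=socket.IPPROTO_TCP)[0][4][0]
```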
Sage Weil [Fri, 21 May 2021 17:31:31 +0000 (13:31 -0400)]
mgr/cephadm: use known host addr
If the host IP/addr is known, use that. The addr might even be an FQDN
instead of an IP address, in which case we want to look that up instead
of the bare hostname.
Sage Weil [Fri, 21 May 2021 16:32:49 +0000 (12:32 -0400)]
mgr/cephadm: resolve IP at 'orch host add' time
We prefer to always have a real IP for hosts in the cluster. This avoids
a reliance on DNS for most operations.
Perhaps more importantly, it means we are less sensitive to inconsistent
host lookup results, for example due to (1) mismatched /etc/hosts files
between machines, or (2) a lookup of the local hostname that returns
127.0.1.1.
Adjust with_hosts() fixture to take an addr, and adjust tests accordingly.
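A minimal sketch of the add-time resolution (a hypothetical function; the actual cephadm code differs): fail the 'orch host add' call rather than storing a bare name or a loopback address.
```python
import ipaddress
import socket

def resolve_host_addr(addr: str) -> str:
    """Return a real IP for 'orch host add', or raise."""
    try:
        ipaddress.ip_address(addr)
        return addr                        # the caller already gave an IP
    except ValueError:
        pass
    try:
        ip = socket.getaddrinfo(addr, None, proto=socket.IPPROTO_TCP)[0][4][0]
    except socket.gaierror as e:
        raise RuntimeError(f'cannot resolve host {addr!r}: {e}')
    if ipaddress.ip_address(ip).is_loopback:
        raise RuntimeError(f'{addr!r} resolves to loopback {ip}; '
                           'please pass an explicit IP')
    return ip
```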
Adam C. Emerson [Thu, 20 May 2021 23:19:55 +0000 (19:19 -0400)]
rgw: Simplify log shard probing and err on the side of omap
In the multigeneration version we no longer care whether entries
exist, since we never delete and recreate empty logs. Remove the logic
that marked entirely empty shards as DNE on the assumption that,
being empty, they would have been deleted.
Fixes: https://tracker.ceph.com/issues/50169
Signed-off-by: Adam C. Emerson <aemerson@redhat.com>
crimson/osd: extend lifetime of OpsExecuter to match all_completed.
f7181ab2f65803ecd8204f8f4f5aad4713b747f3 has optimized the client
parallelism. To achieve that, `PG::do_osd_ops()` was converted to
return what is basically a future of futures. Unfortunately, the
lifetime management of `OpsExecuter` was kept intact. As a result,
the object was valid only until the outer future was fulfilled while,
due to the `rollbacker` instances, it must remain alive until
`all_completed` becomes available.
This issue can explain the following problem that has been observed
in a Teuthology job [1].
The commit deals with the problem by repacking the outer future.
An alternative could be switching from `std::unique_ptr` to
`seastar::shared_ptr` for managing `OpsExecuter`.
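The lifetime rule is easier to see in a toy asyncio analogy (all names here are hypothetical; the actual fix is in crimson's C++): the executor must outlive the inner future, not merely the outer one.
```python
import asyncio

class OpsExecuterLike:
    """Stand-in for OpsExecuter: owns state the rollback path still needs."""
    def __init__(self) -> None:
        self.alive = True
    def release(self) -> None:
        self.alive = False

async def do_osd_ops(ox: OpsExecuterLike):
    # The outer stage finishes here and hands back the inner "all_completed".
    async def all_completed() -> None:
        assert ox.alive, 'rollback state used after executor teardown'
    return all_completed()

async def main() -> None:
    ox = OpsExecuterLike()
    inner = await do_osd_ops(ox)   # outer future fulfilled here
    # Releasing ox at this point would reproduce the bug described above;
    # the fix keeps it alive until the inner future resolves.
    await inner
    ox.release()

asyncio.run(main())
```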
Sage Weil [Thu, 6 May 2021 22:47:27 +0000 (18:47 -0400)]
mgr/nfs: take --ingress argument to 'nfs cluster create'
It is likely that the rook/k8s variation of ingress will not take a
virtual_ip argument. We want to make sure that ingress yes/no can be
specified independently of the virtual_ip.
Sage Weil [Thu, 6 May 2021 14:57:46 +0000 (10:57 -0400)]
cephadm: --stop-signal=SIGTERM
haproxy's container image tells docker|podman to send SIGUSR1 for a "clean"
shutdown. For NFS, the connections never close, so we will always hit the
podman|docker 10s timeout and get a SIGKILL. That, in turn, causes haproxy
to exit with 143, and puts the systemd unit in a failed state.
This highlights a general problem(?) with stopping containers: if they
don't stop quickly, then we end up in this error state. We don't directly
address that here.
Avoid this problem by always stopping containers with SIGTERM. In the
haproxy case, that means an immediate shutdown (no graceful drain of
open connections). In theory we could do this only for haproxy with
NFS, but we can easily imagine RGW connections that don't close in 10s
either, and we don't want containers exiting in an error state; we just
want the proxy to stop quickly.
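For illustration only (not cephadm's actual code), the change boils down to creating containers with an explicit stop signal; --stop-signal is a standard docker/podman flag:
```python
def container_run_args(image: str, name: str) -> list:
    """Build a podman run command line (simplified sketch)."""
    return [
        'podman', 'run', '-d', '--name', name,
        # Override the image's STOPSIGNAL (SIGUSR1 for haproxy) so that
        # 'stop' sends SIGTERM and we never hit the 10s SIGKILL path.
        '--stop-signal=SIGTERM',
        image,
    ]

print(' '.join(container_run_args('docker.io/library/haproxy', 'ceph-haproxy')))
```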
Sage Weil [Mon, 3 May 2021 15:48:45 +0000 (11:48 -0400)]
mgr/orchestrator: default nfs pool, namespaces
Apply nfs default pool (currently 'nfs-ganesha'), and default the
namespace to the service_id.
There is no practical reason for users to ever need to change this, and
requiring them to provide this information at config/apply time just
complicates life.
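A tiny sketch of the defaulting rule (the helper and constant names are illustrative):
```python
from typing import Optional, Tuple

DEFAULT_NFS_POOL = 'nfs-ganesha'   # the current default named above

def nfs_rados_location(service_id: str,
                       namespace: Optional[str] = None) -> Tuple[str, str]:
    # The pool is fixed; the namespace falls back to the service_id.
    return DEFAULT_NFS_POOL, namespace or service_id

assert nfs_rados_location('mynfs') == ('nfs-ganesha', 'mynfs')
```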
Sage Weil [Wed, 5 May 2021 16:59:44 +0000 (12:59 -0400)]
mgr/nfs: remove 'nfs cluster update'
This command is very awkward to implement unless all service spec fields
are always required. That will soon mean both the placement *and* the
virtual_ip (if any), making it much less convenient for a human to use.
Instead, let users update the yaml, or adjust the nfs and/or ingress specs
directly. I don't think this command is needed.
Sage Weil [Mon, 24 May 2021 21:03:33 +0000 (16:03 -0500)]
doc/cephfs/nfs: remove documented limitation
At the time NFS support was added, this limitation applied.
However, in
https://github.com/nfs-ganesha/nfs-ganesha/commit/b3d97f8157a131f55d848ff57e46af91b746b944
and
https://github.com/nfs-ganesha/nfs-ganesha/commit/1cfe7e2df96f9785367ba94d41559140f584a875
we added support for multiple filesystems and started mixing
the fscid into the filehandle.
crimson/osd: fix assertion failure in OpSequencer.
`OpSequencer` assumes that the ID of a previous client request
is always lower than the ID of the current one. This is reflected
by the assertion in `OpSequencer::start_op()`. It triggered
the following failure [1] in Teuthology:
```
DEBUG 2021-05-07 08:01:41,227 [shard 0] osd - client_request(id=1, detail=osd_op(client.4171.0:1 2.2 2.7c339972 (undecoded) ondisk+retry+read+known_if_redirected e29) v8) same_interval_since: 31
ceph-osd: /home/jenkins-build/build/workspace/ceph-dev-new-build/ARCH/x86_64/AVAILABLE_ARCH/x86_64/AVAILABLE_DIST/centos8/DIST/centos8/MACHINE_SIZE/gigantic/release/17.0.0-3910-g1b18e076/rpm/el8/BUILD/ceph-17.0.0-3910-g1b18e076/src/crimson/osd/osd_operation_sequencer.h:38: seastar::futurize_t<Result> crimson::osd::OpSequencer::start_op(HandleT&, uint64_t, uint64_t, FuncT&&) [with HandleT = crimson::PipelineHandle; FuncT = crimson::interruptible::interruptor<InterruptCond>::wrap_function(Func&&) [with Func = crimson::osd::ClientRequest::start()::<lambda()> mutable::<lambda(Ref<crimson::osd::PG>)> mutable::<lambda()> mutable::<lambda()>; InterruptCond = crimson::osd::IOInterruptCondition]::<lambda()>; Result = crimson::interruptible::interruptible_future_detail<crimson::osd::IOInterruptCondition, seastar::future<> >; seastar::futurize_t<Result> = crimson::interruptible::interruptible_future_detail<crimson::osd::IOInterruptCondition, seastar::future<> >; uint64_t = long unsigned int]: Assertion `prev_op < this_op' failed.
Aborting on shard 0.
Backtrace:
Segmentation fault.
Backtrace:
0# 0x00005592B028932F in ceph-osd
1# FatalSignal::signaled(int, siginfo_t const*) in ceph-osd
2# FatalSignal::install_oneshot_signal_handler<6>()::{lambda(int, siginfo_t*, void*)#1}::_FUN(int, siginfo_t*, void*) in ceph-osd
3# 0x00007F57B72E7B20 in /lib64/libpthread.so.0
4# gsignal in /lib64/libc.so.6
5# abort in /lib64/libc.so.6
6# 0x00007F57B58E2B09 in /lib64/libc.so.6
7# 0x00007F57B58F0DE6 in /lib64/libc.so.6
8# 0x00005592ABB8484D in ceph-osd
9# 0x00005592ABB8ACB3 in ceph-osd
10# seastar::continuation<seastar::internal::promise_base_with_type<seastar::bool_class<seastar::stop_iteration_tag> >, seastar::noncopyable_function<seastar::future<seastar::bool_class<seastar::stop_iteration_tag> > (boost::intrusive_ptr<crimson::osd::PG>&&)>, seastar::future<boost::intrusive_ptr<crimson::osd::PG> >::then_impl_nrvo<seastar::noncopyable_function<seastar::future<seastar::bool_class<seastar::stop_iteration_tag> > (boost::intrusive_ptr<crimson::osd::PG>&&)>, seastar::future<seastar::bool_class<seastar::stop_iteration_tag> > >(seastar::noncopyable_function<seastar::future<seastar::bool_class<seastar::stop_iteration_tag> > (boost::intrusive_ptr<crimson::osd::PG>&&)>&&)::{lambda(seastar::internal::promise_base_with_type<seastar::bool_class<seastar::stop_iteration_tag> >&&, seastar::noncopyable_function<seastar::future<seastar::bool_class<seastar::stop_iteration_tag> > (boost::intrusive_ptr<crimson::osd::PG>&&)>&, seastar::future_state<boost::intrusive_ptr<crimson::osd::PG> >&&)#1}, boost::intrusive_ptr<crimson::osd::PG> >::run_and_dispose() in ceph-osd
11# 0x00005592B357F88F in ceph-osd
12# 0x00005592B3584DD0 in ceph-osd
```
Crash analysis resulted in two observations:
1. during the request execution the acting set changed,
the request was interrupted, and an attempt to
re-execute it was made;
2. the interrupted request was the very first client
request the OSD had ever seen.
Code analysis showed a problem in how `ClientRequest`
establishes `prev_op_id`: although this is supposed to happen
only once per request, it can get executed twice, but only
for the very first request `OpSequencer` saw.
```cpp
void ClientRequest::may_set_prev_op()
{
  // set prev_op_id if it's not set yet
  if (__builtin_expect(prev_op_id == 0, true)) {
    prev_op_id = sequencer.get_last_issued();
  }
}
```
Unfortunately, `0` isn't a distinct value that cannot be
returned by `get_last_issued()`:
```cpp
// the id of last op which is issued
uint64_t last_issued = 0;
```
As a result, on the second call `OpSequencer` returned a
new value (actually `this_op`), violating the assertion.
The commit fixes the problem by switching from a designated
sentinel value to `std::optional`.
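In Python terms the fix amounts to the following (a sketch of the idea, not the C++ change itself): since 0 is a legitimate op id, "unset" needs an out-of-band marker.
```python
from typing import Optional

class ClientRequestLike:
    """Toy model: None (an optional), rather than 0, means "not set yet"."""
    def __init__(self, sequencer) -> None:
        self.sequencer = sequencer
        self.prev_op_id: Optional[int] = None   # was: 0 doubled as "unset"

    def may_set_prev_op(self) -> None:
        # Now runs at most once per request, even across re-execution,
        # because a real id of 0 can no longer look like "unset".
        if self.prev_op_id is None:
            self.prev_op_id = self.sequencer.get_last_issued()
```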
Kefu Chai [Mon, 24 May 2021 09:27:34 +0000 (17:27 +0800)]
ceph.in: put libasan.so path before other LD_PRELOAD paths
to ensure it is the first one to be preloaded, and to address errors
like the following:
==821517==ASan runtime does not come first in initial library list; you should either link runtime to your application or manually preload it with LD_PRELOAD.
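The ordering requirement can be sketched like this (an illustrative helper; the libasan path is an assumption, and ceph.in's real logic differs):
```python
import os

def prepend_ld_preload(lib: str) -> None:
    """Put lib first in LD_PRELOAD so the ASan runtime is loaded first."""
    rest = [p for p in os.environ.get('LD_PRELOAD', '').split(':')
            if p and p != lib]
    os.environ['LD_PRELOAD'] = ':'.join([lib] + rest)

prepend_ld_preload('/usr/lib64/libasan.so.6')   # path is an assumption
print(os.environ['LD_PRELOAD'])
```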
sunilkumarn417 [Wed, 19 May 2021 10:02:45 +0000 (15:32 +0530)]
qa/tasks/cephadm: Include bootstrap registry options for downstream
- The registry-url, registry-username and registry-password bootstrap options
are now supported. This is needed to access monitoring service container images.
- Use the RHEL-distribution-based cephadm in the download_cephadm task.
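For illustration, the options map onto 'cephadm bootstrap' flags roughly like this (the helper and example values are assumptions; the flags themselves are real bootstrap options):
```python
def registry_bootstrap_args(url: str, username: str, password: str) -> list:
    # Passed through to 'cephadm bootstrap' so the cluster can pull
    # monitoring images from an authenticated registry.
    return ['--registry-url', url,
            '--registry-username', username,
            '--registry-password', password]

print(registry_bootstrap_args('registry.example.com', 'user', 'secret'))
```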