git.apps.os.sepia.ceph.com Git - ceph.git/log
ceph.git
4 years agodoc: add missing crush-device-class={device-class} pair for clay code profile 41543/head
luo.runbing [Wed, 26 May 2021 02:41:40 +0000 (10:41 +0800)]
doc: add missing crush-device-class={device-class} pair for clay code profile

`crush-device-class` is optional for `ceph osd erasure-code-profile set`,
add it for the sake of completeness

Signed-off-by: luo.runbing <luo.runbing@zte.com.cn>
4 years agoMerge pull request #41401 from rzarzynski/wip-crimson-injectdataerr
Kefu Chai [Wed, 26 May 2021 01:23:35 +0000 (09:23 +0800)]
Merge pull request #41401 from rzarzynski/wip-crimson-injectdataerr

crimson/osd, common: implement the inject{m,}dataerr admin commands

Reviewed-by: Kefu Chai <kchai@redhat.com>
4 years agoMerge pull request #41541 from liu-chunmei/seastore-add-devs
Kefu Chai [Wed, 26 May 2021 00:29:13 +0000 (08:29 +0800)]
Merge pull request #41541 from liu-chunmei/seastore-add-devs

crimson/seastore: add --seastore-devs in vstart.sh

Reviewed-by: Samuel Just <sjust@redhat.com>
Reviewed-by: Kefu Chai <kchai@redhat.com>
4 years agoMerge pull request #41533 from tchaikov/wip-doc-rgw-conf
Kefu Chai [Wed, 26 May 2021 00:18:27 +0000 (08:18 +0800)]
Merge pull request #41533 from tchaikov/wip-doc-rgw-conf

doc/radosgw: use confval directive to define options

Reviewed-by: Josh Durgin <jdurgin@redhat.com>
Reviewed-by: Casey Bodley <cbodley@redhat.com>
4 years agocrimson/osd: implement the injectmdataerr admin command. 41401/head
Radoslaw Zarzynski [Wed, 19 May 2021 12:38:22 +0000 (12:38 +0000)]
crimson/osd: implement the injectmdataerr admin command.

Signed-off-by: Radoslaw Zarzynski <rzarzyns@redhat.com>
4 years agocrimson/osd: implement the injectdataerr admin command.
Radoslaw Zarzynski [Wed, 19 May 2021 12:30:23 +0000 (12:30 +0000)]
crimson/osd: implement the injectdataerr admin command.

Signed-off-by: Radoslaw Zarzynski <rzarzyns@redhat.com>
4 years agocrimson/os: implement inject_{m,}data_error in AlienStore.
Radoslaw Zarzynski [Tue, 18 May 2021 14:38:50 +0000 (14:38 +0000)]
crimson/os: implement inject_{m,}data_error in AlienStore.

Signed-off-by: Radoslaw Zarzynski <rzarzyns@redhat.com>
4 years agoMerge pull request #41434 from tchaikov/wip-cmd-getval
Kefu Chai [Wed, 26 May 2021 00:03:46 +0000 (08:03 +0800)]
Merge pull request #41434 from tchaikov/wip-cmd-getval

common/cmdparse: use string_view for the key and return val by retval

Reviewed-by: Radoslaw Zarzynski <rzarzyns@redhat.com>
4 years agocrimson/seastore: add --seastore-devs in vstart.sh 41541/head
chunmei-liu [Tue, 25 May 2021 20:34:00 +0000 (13:34 -0700)]
crimson/seastore: add --seastore-devs in vstart.sh

to support /dev/xxx as seastore device

Signed-off-by: chunmei-liu <chunmei.liu@intel.com>
4 years agoMerge PR #41007 into master
Sage Weil [Tue, 25 May 2021 20:17:44 +0000 (16:17 -0400)]
Merge PR #41007 into master

* refs/pull/41007/head:
qa/tasks/cephfs/test_nfs: fix info test
doc/cephfs/fs-nfs-exports: document --ingress --virtual-ip
mgr/nfs: move ingress vs virtual_ip check to cluster interface
PendingReleaseNotes: clarify deprecated
PendingReleaseNotes: note breaking CLI changes
doc/cephadm/nfs: document nfs+ingress
qa/suites/rados/cephadm/smoke-roleless: test nfs, nfs + ingress
mgr/nfs: take --ingress argument to 'nfs cluster create'
mgr/cephadm: adjust debug output for device refresh
mgr/cephadm: ingress: fix log msg
mgr/cephadm: fix logging of config/placement errors
common/options: enable nfs module for new clusters
cephadm: --stop-signal=SIGTERM
mgr/orchestrator: default nfs pool, namespaces
mgr/cephadm: nfs: create pool if it doesn't yet exist
doc/cephadm/nfs: update
mgr/nfs: change 'nfs cluster info'
mgr/nfs: take optional virtual_ip for deploying ingress
mgr/nfs: remove 'nfs cluster update'
mgr/nfs: factor out ganesha pool creation
mgr/nfs: delete -> rm for CLI
mgr/nfs: add some type annotations
python-common: fix IngressSpec yaml dump
mgr/cephadm: ingress: remove eth0 default
qa/tasks/cephadm: allow mounting volumes in shell
cephadm: add -v arg to shell
qa/tasks/vip: add 'vip.exec' task
mgr/orchestrator: add --port arg to 'orch apply nfs'
mgr/cephadm: nfs: add purge
mgr/cephadm: ingress: support nfs
mgr/cephadm: do not reconfigure daemons on deleted services
mgr/cephadm: nfs: shell out to rados tool for conf creation
mgr/cephadm: nfs: add rank to grace file from mgr module
mgr/cephadm: nfs: bind ganesha to appropriate ip:port
mgr/cephadm: enable ranked daemons for nfs
mgr/cephadm: support creation of daemons with ranks
mgr/cephadm: make _plan show removed daemon names
mgr/cephadm/schedule: assign/map ranks
mgr/cephadm: add rank[_generation] properties
mgr/cephadm/inventory: store optional rank_map along with specs
mgr/cephadm: include service_name is generated DaemonDescription
mgr/orchestrator: include service_name in DaemonDescription dump
mgr/cephadm/inventory: fix deleted check
mgr/cephadm: simplify
mgr/cephadm/schedule: make placement shuffle deterministic
mgr/cephadm: document CephadmService flags

Reviewed-by: Michael Fritch <mfritch@suse.com>
Reviewed-by: Varsha Rao <varao@redhat.com>
4 years agoMerge PR #41539 into master
Sage Weil [Tue, 25 May 2021 20:17:21 +0000 (16:17 -0400)]
Merge PR #41539 into master

* refs/pull/41539/head:
doc/cephadm: fix prompts in service-management.rst

Reviewed-by: Sage Weil <sage@redhat.com>
4 years agodoc/cephadm: fix prompts in service-management.rst 41539/head
Zac Dover [Tue, 25 May 2021 19:22:56 +0000 (05:22 +1000)]
doc/cephadm: fix prompts in service-management.rst

This PR formats the prompts in service-management.rst
properly.

Signed-off-by: Zac Dover <zac.dover@gmail.com>
4 years agodoc/radosgw: use confval directive to define options 41533/head
Kefu Chai [Wed, 19 May 2021 14:36:07 +0000 (22:36 +0800)]
doc/radosgw: use confval directive to define options

less repeating this way

Signed-off-by: Kefu Chai <kchai@redhat.com>
4 years agoqa/tasks/cephfs/test_nfs: fix info test 41007/head
Sage Weil [Fri, 7 May 2021 19:01:10 +0000 (15:01 -0400)]
qa/tasks/cephfs/test_nfs: fix info test

Signed-off-by: Sage Weil <sage@newdream.net>
4 years agodoc/cephfs/fs-nfs-exports: document --ingress --virtual-ip
Sage Weil [Mon, 24 May 2021 15:16:45 +0000 (11:16 -0400)]
doc/cephfs/fs-nfs-exports: document --ingress --virtual-ip

Signed-off-by: Sage Weil <sage@newdream.net>
4 years agomgr/nfs: move ingress vs virtual_ip check to cluster interface
Sage Weil [Tue, 18 May 2021 22:02:25 +0000 (18:02 -0400)]
mgr/nfs: move ingress vs virtual_ip check to cluster interface

Signed-off-by: Sage Weil <sage@newdream.net>
4 years agoPendingReleaseNotes: clarify deprecated
Sage Weil [Fri, 7 May 2021 15:01:57 +0000 (11:01 -0400)]
PendingReleaseNotes: clarify deprecated

Signed-off-by: Sage Weil <sage@newdream.net>
4 years agoPendingReleaseNotes: note breaking CLI changes
Sage Weil [Fri, 7 May 2021 14:58:45 +0000 (10:58 -0400)]
PendingReleaseNotes: note breaking CLI changes

Signed-off-by: Sage Weil <sage@newdream.net>
4 years agodoc/cephadm/nfs: document nfs+ingress
Sage Weil [Thu, 6 May 2021 22:47:38 +0000 (18:47 -0400)]
doc/cephadm/nfs: document nfs+ingress

Signed-off-by: Sage Weil <sage@newdream.net>
4 years agoqa/suites/rados/cephadm/smoke-roleless: test nfs, nfs + ingress
Sage Weil [Fri, 30 Apr 2021 15:37:51 +0000 (11:37 -0400)]
qa/suites/rados/cephadm/smoke-roleless: test nfs, nfs + ingress

Still missing a full client mount test, though!

Signed-off-by: Sage Weil <sage@newdream.net>
4 years agomgr/nfs: take --ingress argument to 'nfs cluster create'
Sage Weil [Thu, 6 May 2021 22:47:27 +0000 (18:47 -0400)]
mgr/nfs: take --ingress argument to 'nfs cluster create'

It is likely that the rook/k8s variation of ingress will not take a
virtual_ip argument.  We want to make sure that ingress yes/no can be
specified independent of the virtual_ip.

Signed-off-by: Sage Weil <sage@newdream.net>
4 years agomgr/cephadm: adjust debug output for device refresh
Sage Weil [Thu, 6 May 2021 18:37:14 +0000 (14:37 -0400)]
mgr/cephadm: adjust debug output for device refresh

Signed-off-by: Sage Weil <sage@newdream.net>
4 years agomgr/cephadm: ingress: fix log msg
Sage Weil [Thu, 6 May 2021 18:16:43 +0000 (14:16 -0400)]
mgr/cephadm: ingress: fix log msg

Signed-off-by: Sage Weil <sage@newdream.net>
4 years agomgr/cephadm: fix logging of config/placement errors
Sage Weil [Thu, 6 May 2021 18:16:38 +0000 (14:16 -0400)]
mgr/cephadm: fix logging of config/placement errors

Signed-off-by: Sage Weil <sage@newdream.net>
4 years agocommon/options: enable nfs module for new clusters
Sage Weil [Thu, 6 May 2021 15:21:49 +0000 (11:21 -0400)]
common/options: enable nfs module for new clusters

Signed-off-by: Sage Weil <sage@newdream.net>
4 years agocephadm: --stop-signal=SIGTERM
Sage Weil [Thu, 6 May 2021 14:57:46 +0000 (10:57 -0400)]
cephadm: --stop-signal=SIGTERM

haproxy's container image tells docker|podman to send SIGUSR1 for a "clean"
shutdown.  For NFS, the connections never close, so we will always hit the
podman|docker 10s timeout and get a SIGKILL.  That, in turn, causes haproxy
to exit with 143, and puts the systemd unit in a failed state.

This highlights a general problem(?) with stopping containers: if they don't
do it quickly then we'll end up in this error state.  We don't directly
address that here.

Avoid this problem by always stopping containers with SIGTERM.  In the
haproxy case, that means an immediate shutdown (no graceful drain of
open connections).  In theory we could do this only for haproxy with
NFS, but we can easily imagine RGW connections that don't close in 10s
either, and we don't want containers exiting in error state--we just
want the proxy to stop quickly.

Signed-off-by: Sage Weil <sage@newdream.net>
4 years agomgr/orchestrator: default nfs pool, namespaces
Sage Weil [Mon, 3 May 2021 15:48:45 +0000 (11:48 -0400)]
mgr/orchestrator: default nfs pool, namespaces

Apply nfs default pool (currently 'nfs-ganesha'), and default the
namespace to the service_id.

There is no practical reason for users to ever need to change this, and
requiring them to provide this information at config/apply time just
complicates life.

Signed-off-by: Sage Weil <sage@newdream.net>
4 years agomgr/cephadm: nfs: create pool if it doesn't yet exist
Sage Weil [Mon, 3 May 2021 15:42:13 +0000 (11:42 -0400)]
mgr/cephadm: nfs: create pool if it doesn't yet exist

Signed-off-by: Sage Weil <sage@newdream.net>
4 years agodoc/cephadm/nfs: update
Sage Weil [Wed, 5 May 2021 16:26:28 +0000 (12:26 -0400)]
doc/cephadm/nfs: update

- leave off pool/ns, since they should almost never be necessary.
- add port

Signed-off-by: Sage Weil <sage@newdream.net>
4 years agomgr/nfs: change 'nfs cluster info'
Sage Weil [Tue, 4 May 2021 17:10:14 +0000 (13:10 -0400)]
mgr/nfs: change 'nfs cluster info'

- include the virtual_ip and port at top level
- move backend server list into a sub-item
- include (haproxy) monitoring port

Signed-off-by: Sage Weil <sage@newdream.net>
4 years agomgr/nfs: take optional virtual_ip for deploying ingress
Sage Weil [Tue, 4 May 2021 17:09:38 +0000 (13:09 -0400)]
mgr/nfs: take optional virtual_ip for deploying ingress

For 'nfs cluster create', optionally take a virtual_ip to deploy ingress.

Signed-off-by: Sage Weil <sage@newdream.net>
4 years agomgr/nfs: remove 'nfs cluster update'
Sage Weil [Wed, 5 May 2021 16:59:44 +0000 (12:59 -0400)]
mgr/nfs: remove 'nfs cluster update'

This command is very awkward to implement unless all service spec fields
are always required.  That will soon mean both the placement *and*
virtual_ip (if any), making it much less useful for a human to make use
of.

Instead, let them update yaml, or adjust the nfs and/or ingress specs
directly.  I don't think this command is needed.

Signed-off-by: Sage Weil <sage@newdream.net>
4 years agodoc/_ext: render :example: field of an option
Kefu Chai [Wed, 19 May 2021 14:35:36 +0000 (22:35 +0800)]
doc/_ext: render :example: field of an option

some options have this field in their documentation, so let's render it
as well.

Signed-off-by: Kefu Chai <kchai@redhat.com>
4 years agoMerge pull request #41526 from rzarzynski/wip-crimson-drop-handle_failed_op
Kefu Chai [Tue, 25 May 2021 12:21:55 +0000 (20:21 +0800)]
Merge pull request #41526 from rzarzynski/wip-crimson-drop-handle_failed_op

crimson/osd: drop the unused handle_failed_op() from PG.

Reviewed-by: Kefu Chai <kchai@redhat.com>
4 years agocrimson/osd: drop the unused handle_failed_op() from PG. 41526/head
Radoslaw Zarzynski [Tue, 25 May 2021 10:10:07 +0000 (10:10 +0000)]
crimson/osd: drop the unused handle_failed_op() from PG.

It became unused after the `InternalClientRequest` rework.

Signed-off-by: Radoslaw Zarzynski <rzarzyns@redhat.com>
4 years agoMerge pull request #41447 from rhcs-dashboard/50909-fix-nfs-rgw-tenant-user
Ernesto Puerta [Tue, 25 May 2021 08:25:30 +0000 (10:25 +0200)]
Merge pull request #41447 from rhcs-dashboard/50909-fix-nfs-rgw-tenant-user

mgr/dashboard: show RGW tenant user id correctly in 'NFS create export' form

Reviewed-by: Waad Alkhoury <walkhour@redhat.com>
Reviewed-by: Alfonso Martínez <almartin@redhat.com>
Reviewed-by: Avan Thakkar <athakkar@redhat.com>
Reviewed-by: Ernesto Puerta <epuertat@redhat.com>
Reviewed-by: Nizamudeen A <nia@redhat.com>
4 years agoMerge pull request #41518 from cyx1231st/wip-seastore-onode-tree-fix-test
Kefu Chai [Tue, 25 May 2021 08:09:56 +0000 (16:09 +0800)]
Merge pull request #41518 from cyx1231st/wip-seastore-onode-tree-fix-test

crimson/onode-staged-tree: fix an use-after-free issue in test

Reviewed-by: Kefu Chai <kchai@redhat.com>
4 years agoMerge pull request #41515 from tchaikov/wip-crimson-cleanup
Kefu Chai [Tue, 25 May 2021 08:03:42 +0000 (16:03 +0800)]
Merge pull request #41515 from tchaikov/wip-crimson-cleanup

crimson/osd: do not capture unused variable

Reviewed-by: Radoslaw Zarzynski <rzarzyns@redhat.com>
Reviewed-by: Samuel Just <sjust@redhat.com>
4 years agocrimson/osd: do not capture unused variable 41515/head
Kefu Chai [Tue, 25 May 2021 06:17:09 +0000 (14:17 +0800)]
crimson/osd: do not capture unused variable

Signed-off-by: Kefu Chai <kchai@redhat.com>
4 years agomgr/dashboard: show RGW tenant user id correctly in 'NFS create export' form. 41447/head
Alfonso Martínez [Thu, 20 May 2021 15:51:35 +0000 (17:51 +0200)]
mgr/dashboard: show RGW tenant user id correctly in 'NFS create export' form.

Fixes: https://tracker.ceph.com/issues/50909
Signed-off-by: Alfonso Martínez <almartin@redhat.com>
4 years agocrimson/onode-staged-tree: fix an use-after-free issue in test 41518/head
Yingxin Cheng [Tue, 25 May 2021 06:21:21 +0000 (14:21 +0800)]
crimson/onode-staged-tree: fix an use-after-free issue in test

Signed-off-by: Yingxin Cheng <yingxin.cheng@intel.com>
4 years agoMerge pull request #41500 from rzarzynski/wip-crison-opsequncer-assert-failure
Kefu Chai [Tue, 25 May 2021 04:14:31 +0000 (12:14 +0800)]
Merge pull request #41500 from rzarzynski/wip-crison-opsequncer-assert-failure

crimson/osd: fix assertion failure in OpSequencer.

Reviewed-by: Kefu Chai <kchai@redhat.com>
Reviewed-by: Xuehan Xu <xuxuehan@360.cn>
4 years agoMerge pull request #41414 from sunilkumarn417/rh_downstream
Sunil Kumar Nagaraju [Tue, 25 May 2021 03:30:07 +0000 (09:00 +0530)]
Merge pull request #41414 from sunilkumarn417/rh_downstream

qa/tasks/cephadm.py: Include bootstrap registry options for downstream

4 years agoMerge pull request #41473 from tchaikov/wip-doc-mgr-influx
Kefu Chai [Tue, 25 May 2021 01:31:49 +0000 (09:31 +0800)]
Merge pull request #41473 from tchaikov/wip-doc-mgr-influx

doc/mgr/influx: use :confval: directive

Reviewed-by: Josh Durgin <jdurgin@redhat.com>
4 years agoMerge pull request #41512 from liu-chunmei/crimson-fix-build-error
Kefu Chai [Tue, 25 May 2021 01:11:52 +0000 (09:11 +0800)]
Merge pull request #41512 from liu-chunmei/crimson-fix-build-error

crimson/seastore: fix build error.

Reviewed-by: Samuel Just <sjust@redhat.com>
Reviewed-by: Radoslaw Zarzynski <rzarzyns@redhat.com>
Reviewed-by: Kefu Chai <kchai@redhat.com>
4 years agodoc/mgr/influx: use :confval: directive 41473/head
Kefu Chai [Fri, 21 May 2021 07:21:48 +0000 (15:21 +0800)]
doc/mgr/influx: use :confval: directive

less repeating this way

Signed-off-by: Kefu Chai <kchai@redhat.com>
4 years agoMerge pull request #41487 from neha-ojha/wip-toc
Neha Ojha [Mon, 24 May 2021 21:44:18 +0000 (14:44 -0700)]
Merge pull request #41487 from neha-ojha/wip-toc

qa/suites/rados/thrash-old-clients: remove luminous and mimic and use centos_latest

Reviewed-by: Josh Durgin <jdurgin@redhat.com>
4 years agocrimson/seastore: fix build error. 41512/head
chunmei-liu [Mon, 24 May 2021 21:20:19 +0000 (14:20 -0700)]
crimson/seastore: fix build error.

Signed-off-by: chunmei-liu <chunmei.liu@intel.com>
4 years agoMerge pull request #41486 from neha-ojha/wip-49139-new
Neha Ojha [Mon, 24 May 2021 19:53:46 +0000 (12:53 -0700)]
Merge pull request #41486 from neha-ojha/wip-49139-new

qa: use ubuntu_latest for perf suites and remove cosbench workloads

Reviewed-by: Kefu Chai <kchai@redhat.com>
4 years agoMerge pull request #41504 from yuriw/wip-yuriw-master
Yuri Weinstein [Mon, 24 May 2021 19:21:00 +0000 (12:21 -0700)]
Merge pull request #41504 from yuriw/wip-yuriw-master

qa/tests - removed ref to 18.04 distro as it's not supported on master+

Reviewed-by: Sage Weil <sage@redhat.com>
Reviewed-by: Neha Ojha <nojha@redhat.com>
4 years agoMerge pull request #41430 from rhcs-dashboard/fix-api-docs-link
Ernesto Puerta [Mon, 24 May 2021 18:39:53 +0000 (20:39 +0200)]
Merge pull request #41430 from rhcs-dashboard/fix-api-docs-link

mgr/dashboard: fix API docs link

Reviewed-by: Aashish Sharma <aasharma@redhat.com>
Reviewed-by: Alfonso Martínez <almartin@redhat.com>
Reviewed-by: Avan Thakkar <athakkar@redhat.com>
Reviewed-by: Ernesto Puerta <epuertat@redhat.com>
Reviewed-by: Nizamudeen A <nia@redhat.com>
Reviewed-by: Pere Diaz Bou <pdiazbou@redhat.com>
4 years agoMerge pull request #41426 from rhcs-dashboard/drop-container-image-columns
Ernesto Puerta [Mon, 24 May 2021 18:37:25 +0000 (20:37 +0200)]
Merge pull request #41426 from rhcs-dashboard/drop-container-image-columns

mgr/dashboard: drop container image name and id from services list

Reviewed-by: Waad Alkhoury <walkhour@redhat.com>
Reviewed-by: Avan Thakkar <athakkar@redhat.com>
Reviewed-by: Ernesto Puerta <epuertat@redhat.com>
Reviewed-by: Nizamudeen A <nia@redhat.com>
Reviewed-by: Pere Diaz Bou <pdiazbou@redhat.com>
4 years agoqa/suites/rados/thrash-old-clients: use centos_latest.yaml 41487/head
Neha Ojha [Mon, 24 May 2021 16:45:47 +0000 (16:45 +0000)]
qa/suites/rados/thrash-old-clients: use centos_latest.yaml

use centos_latest instead of bionic because it is the only common
distro for which we build packages for nautilus and above.

Signed-off-by: Neha Ojha <nojha@redhat.com>
4 years agocrimson/osd: fix assertion failure in OpSequencer. 41500/head
Radoslaw Zarzynski [Mon, 24 May 2021 11:15:51 +0000 (11:15 +0000)]
crimson/osd: fix assertion failure in OpSequencer.

`OpSequencer` assumes that the ID of a previous client request
is always lower than the ID of the current one. This is reflected
by the assertion in `OpSequencer::start_op()`. It triggered
the following failure [1] in Teuthology:

```
DEBUG 2021-05-07 08:01:41,227 [shard 0] osd - client_request(id=1, detail=osd_op(client.4171.0:1 2.2 2.7c339972 (undecoded) ondisk+retry+read+known_if_redirected e29) v8) same_interval_since: 31
ceph-osd: /home/jenkins-build/build/workspace/ceph-dev-new-build/ARCH/x86_64/AVAILABLE_ARCH/x86_64/AVAILABLE_DIST/centos8/DIST/centos8/MACHINE_SIZE/gigantic/release/17.0.0-3910-g1b18e076/rpm/el8/BUILD/ceph-
17.0.0-3910-g1b18e076/src/crimson/osd/osd_operation_sequencer.h:38: seastar::futurize_t<Result> crimson::osd::OpSequencer::start_op(HandleT&, uint64_t, uint64_t, FuncT&&) [with HandleT = crimson::PipelineHa
ndle; FuncT = crimson::interruptible::interruptor<InterruptCond>::wrap_function(Func&&) [with Func = crimson::osd::ClientRequest::start()::<lambda()> mutable::<lambda(Ref<crimson::osd::PG>)> mutable::<lambd
a()> mutable::<lambda()>; InterruptCond = crimson::osd::IOInterruptCondition]::<lambda()>; Result = crimson::interruptible::interruptible_future_detail<crimson::osd::IOInterruptCondition, seastar::future<>
>; seastar::futurize_t<Result> = crimson::interruptible::interruptible_future_detail<crimson::osd::IOInterruptCondition, seastar::future<> >; uint64_t = long unsigned int]: Assertion `prev_op < this_op' fai
led.
Aborting on shard 0.
Backtrace:
Segmentation fault.
Backtrace:
 0# 0x00005592B028932F in ceph-osd
 1# FatalSignal::signaled(int, siginfo_t const*) in ceph-osd
 2# FatalSignal::install_oneshot_signal_handler<6>()::{lambda(int, siginfo_t*, void*)#1}::_FUN(int, siginfo_t*, void*) in ceph-osd
 3# 0x00007F57B72E7B20 in /lib64/libpthread.so.0
 4# gsignal in /lib64/libc.so.6
 5# abort in /lib64/libc.so.6
 6# 0x00007F57B58E2B09 in /lib64/libc.so.6
 7# 0x00007F57B58F0DE6 in /lib64/libc.so.6
 8# 0x00005592ABB8484D in ceph-osd
 9# 0x00005592ABB8ACB3 in ceph-osd
10# seastar::continuation<seastar::internal::promise_base_with_type<seastar::bool_class<seastar::stop_iteration_tag> >, seastar::noncopyable_function<seastar::future<seastar::bool_class<seastar::stop_iteration_tag> > (boost::intrusive_ptr<crimson::osd::PG>&&)>, seastar::future<boost::intrusive_ptr<crimson::osd::PG> >::then_impl_nrvo<seastar::noncopyable_function<seastar::future<seastar::bool_class<seastar::stop_iteration_tag> > (boost::intrusive_ptr<crimson::osd::PG>&&)>, seastar::future<seastar::bool_class<seastar::stop_iteration_tag> > >(seastar::noncopyable_function<seastar::future<seastar::bool_class<seastar::stop_iteration_tag> > (boost::intrusive_ptr<crimson::osd::PG>&&)>&&)::{lambda(seastar::internal::promise_base_with_type<seastar::bool_class<seastar::stop_iteration_tag> >&&, seastar::noncopyable_function<seastar::future<seastar::bool_class<seastar::stop_iteration_tag> > (boost::intrusive_ptr<crimson::osd::PG>&&)>&, seastar::future_state<boost::intrusive_ptr<crimson::osd::PG> >&&)#1}, boost::intrusive_ptr<crimson::osd::PG> >::run_and_dispose() in ceph-osd
11# 0x00005592B357F88F in ceph-osd
12# 0x00005592B3584DD0 in ceph-osd
```

[1]: http://pulpito.front.sepia.ceph.com/rzarzynski-2021-05-07_07:41:02-rados-master-distro-basic-smithi/6104530

Crash analysis resulted in two observations:
1. during the request execution the acting set got
   changed, the request was interrupted, and an attempt
   to re-execute it followed;
2. the interrupted request was the very first client
   request the OSD has ever seen.

Code analysis showed a problem in how `ClientRequest`
establishes `prev_op_id`: although supposed to be performed
only once for a request, it can get executed twice but only
for the very first request `OpSequencer` saw.

```cpp
void ClientRequest::may_set_prev_op()
{
  // set prev_op_id if it's not set yet
  if (__builtin_expect(prev_op_id == 0, true)) {
    prev_op_id = sequencer.get_last_issued();
  }
}
```

Unfortunately, `0` isn't a distinct value that cannot
be returned by `get_last_issued()`:

```cpp
class OpSequencer {
  // ...

  uint64_t get_last_issued() const {
    return last_issued;
  }

  // ...

  // the id of last op which is issued
  uint64_t last_issued = 0;
};
```

As a result, on the second call `OpSequencer` returned
a new value (actually `this_op`), violating the assertion.
The commit fixes the problem by switching from a designated
value to `std::optional`.
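
The idea can be sketched as follows. This is a minimal illustration
under stated assumptions, not the actual patch: `SequencerSketch`,
`RequestSketch`, `issue()` and `prev_op()` are hypothetical stand-ins
introduced here. An empty `std::optional` means "not set yet", so a
stored `0` is no longer mistaken for an unset value.

```cpp
#include <cassert>
#include <cstdint>
#include <optional>

// Hypothetical stand-in for OpSequencer: remembers the last issued op id.
class SequencerSketch {
public:
  std::uint64_t get_last_issued() const { return last_issued; }
  void issue(std::uint64_t id) { last_issued = id; }

private:
  // 0 is a legitimate answer before any op has been issued
  std::uint64_t last_issued = 0;
};

// Hypothetical stand-in for ClientRequest.
class RequestSketch {
public:
  explicit RequestSketch(SequencerSketch& s) : sequencer(s) {}

  // Records the previous op id at most once, even if that id is 0.
  void may_set_prev_op() {
    if (!prev_op_id.has_value()) {
      prev_op_id = sequencer.get_last_issued();
    }
  }

  std::uint64_t prev_op() const {
    assert(prev_op_id.has_value());
    return *prev_op_id;
  }

private:
  SequencerSketch& sequencer;
  // empty means "not set yet", which is distinct from every valid id
  std::optional<std::uint64_t> prev_op_id;
};
```

With the sentinel-based check, a second `may_set_prev_op()` call on the
very first request would pick up the newer id; in this sketch the first
recorded value is kept.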

Signed-off-by: Radoslaw Zarzynski <rzarzyns@redhat.com>
4 years agoqa/tests - removed ref to 18.04 distro as it's not supported on master+ 41504/head
Yuri Weinstein [Mon, 24 May 2021 17:40:23 +0000 (10:40 -0700)]
qa/tests - removed ref to 18.04 distro as it's not supported on master+

Signed-off-by: Yuri Weinstein <yweinste@redhat.com>
4 years agoMerge PR #41451 into master
Sage Weil [Mon, 24 May 2021 17:30:27 +0000 (13:30 -0400)]
Merge PR #41451 into master

* refs/pull/41451/head:
qa/suites/rados: include rook test in rados

Reviewed-by: Yuri Weinstein <yweins@redhat.com>
4 years agomgr/dashboard: fix API docs link 41430/head
Avan Thakkar [Wed, 19 May 2021 23:57:29 +0000 (05:27 +0530)]
mgr/dashboard: fix API docs link

Fixes: https://tracker.ceph.com/issues/50890
Signed-off-by: Avan Thakkar <athakkar@redhat.com>
4 years agoMerge pull request #41489 from onitopl/rbd_mirroring_doc
Ilya Dryomov [Mon, 24 May 2021 08:14:38 +0000 (10:14 +0200)]
Merge pull request #41489 from onitopl/rbd_mirroring_doc

doc/rbd: add missing snapshot in command line examples

Reviewed-by: Ilya Dryomov <idryomov@gmail.com>
4 years agoMerge pull request #41492 from tchaikov/wip-mon_data_avail_crit-in-ceph.conf
Kefu Chai [Mon, 24 May 2021 07:55:02 +0000 (15:55 +0800)]
Merge pull request #41492 from tchaikov/wip-mon_data_avail_crit-in-ceph.conf

vstart.sh: specify mon_data_avail_crit in ceph.conf

Reviewed-by: Radoslaw Zarzynski <rzarzyns@redhat.com>
4 years agoqa/tasks/cephadm: Include bootstrap registry options for downstream 41414/head
sunilkumarn417 [Wed, 19 May 2021 10:02:45 +0000 (15:32 +0530)]
qa/tasks/cephadm: Include bootstrap registry options for downstream
- registry-url, registry-username and registry-password bootstrap options are
supported now. This is needed to access monitoring service container images.
- usage of RHEL distribution based cephadm in download_cephadm task.

Signed-off-by: sunilkumarn417 <sunnagar@redhat.com>
4 years agoMerge pull request #41363 from Aran85/crimson-fix-syntax
Kefu Chai [Mon, 24 May 2021 02:45:33 +0000 (10:45 +0800)]
Merge pull request #41363 from Aran85/crimson-fix-syntax

crimson/seastore: remove unused method

Reviewed-by: Kefu Chai <kchai@redhat.com>
4 years agovstart.sh: specify mon_data_avail_crit in ceph.conf 41492/head
Kefu Chai [Mon, 24 May 2021 02:21:52 +0000 (10:21 +0800)]
vstart.sh: specify mon_data_avail_crit in ceph.conf

ceph-mon consumes this option when it boots, and exits if the ratio
of free space is lower than the specified number, which is 5% by
default. but we use `ceph -c $conf_fn config assimilate-conf -i -`
to absorb these options after the monitor starts. so, without this change,
the default value of mon_data_avail_crit is always used. if the machine
has a lower ratio of free space on the partition where the mon store is
located, ceph-mon just exits with an error message like:

2021-05-24T01:53:14.644+0000 7ff64961e580 -1 error: monitor data
filesystem reached concerning levels of available storage space
(available: 4% 17 GiB)

after this change, the option is written in ceph.conf, and can be
read by ceph-mon when it boots. so the overridden value of 1% has
the chance to take effect. this helps to address some test failures
found in our "make check" runs performed by jenkins on machines whose
disk space is enough for completing the test, but whose ratio of free
space is lower than 5%.

Signed-off-by: Kefu Chai <kchai@redhat.com>
4 years agocrimson/seastore: remove unused method 41363/head
Aran85 [Mon, 17 May 2021 12:57:38 +0000 (20:57 +0800)]
crimson/seastore: remove unused method

Signed-off-by: Zengran Zhang <zhangzengran@sangfor.com.cn>
4 years agodoc/rbd: add missing snapshot in command line examples 41489/head
Grzegorz Wieczorek [Sat, 22 May 2021 15:19:04 +0000 (17:19 +0200)]
doc/rbd: add missing snapshot in command line examples

Signed-off-by: Grzegorz Wieczorek <grzegorz.wieczorek@onito.pl>
4 years agoMerge pull request #41480 from MrFreezeex/fix-segfault-replayer-snapshot-shutdown
Ilya Dryomov [Sun, 23 May 2021 14:32:19 +0000 (16:32 +0200)]
Merge pull request #41480 from MrFreezeex/fix-segfault-replayer-snapshot-shutdown

rbd-mirror: fix segfault in snapshot replayer shutdown

Reviewed-by: Ilya Dryomov <idryomov@gmail.com>
4 years agoMerge pull request #41433 from tchaikov/wip-50891
Kefu Chai [Sun, 23 May 2021 00:43:07 +0000 (08:43 +0800)]
Merge pull request #41433 from tchaikov/wip-50891

os/bluestore/bluestore_tool: compare retval stat() with -1

Reviewed-by: Igor Fedotov <ifedotov@suse.com>
4 years agoMerge pull request #41429 from ifed01/wip-ifed-fix-repair-multithreading
Kefu Chai [Sun, 23 May 2021 00:42:03 +0000 (08:42 +0800)]
Merge pull request #41429 from ifed01/wip-ifed-fix-repair-multithreading

os/bluestore: introduce multithireading sync for bluestore's repairer

Reviewed-by: Adam Kupczyk <akupczyk@redhat.com>
4 years agoMerge pull request #41436 from runsisi/wip-fix-unit
Kefu Chai [Sun, 23 May 2021 00:39:33 +0000 (08:39 +0800)]
Merge pull request #41436 from runsisi/wip-fix-unit

common/options: fix option type for bluestore_block_db_size

Reviewed-by: Igor Fedotov <ifedotov@suse.com>
Reviewed-by: Kefu Chai <kchai@redhat.com>
4 years agoMerge pull request #41466 from ansiwen/install-cepfs-headers
Kefu Chai [Sun, 23 May 2021 00:38:25 +0000 (08:38 +0800)]
Merge pull request #41466 from ansiwen/install-cepfs-headers

include/cephfs: add cephfs headers to CMakeLists.txt

Reviewed-by: Kefu Chai <kchai@redhat.com>
Reviewed-by: Patrick Donnelly <pdonnell@redhat.com>
4 years agocommon: introduce std::optional-returning variants of cmd_getval(). 41434/head
Radoslaw Zarzynski [Wed, 19 May 2021 12:28:31 +0000 (12:28 +0000)]
common: introduce std::optional-returning variants of cmd_getval().

Using an output parameter instead of returning is confusing but
common in pre-C++11 code. Let's modernize `cmd_getval()`.
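
A rough sketch of the shape of such an API, assuming a simplified
command map built on `std::variant`. The names `value_sketch`,
`cmdmap_sketch` and `getval_sketch` are hypothetical and do not
reproduce the real `cmd_getval()` signature.

```cpp
#include <cassert>
#include <cstdint>
#include <map>
#include <optional>
#include <string>
#include <variant>

// Hypothetical simplified command map; not the real Ceph type.
using value_sketch = std::variant<std::string, std::int64_t, bool>;
using cmdmap_sketch = std::map<std::string, value_sketch>;

// Returns the value stored under `key` if it exists and holds a T,
// otherwise an empty optional. No output parameter is involved.
template <typename T>
std::optional<T> getval_sketch(const cmdmap_sketch& cmdmap,
                               const std::string& key)
{
  auto found = cmdmap.find(key);
  if (found == cmdmap.end()) {
    return std::nullopt;
  }
  if (const T* val = std::get_if<T>(&found->second)) {
    return *val;
  }
  return std::nullopt;
}

// Usage check: call this from main() to exercise the helper.
inline void check_getval_sketch()
{
  cmdmap_sketch cmdmap;
  cmdmap["pool"] = std::string("rbd");
  cmdmap["size"] = std::int64_t{3};
  assert(getval_sketch<std::string>(cmdmap, "pool").value_or("") == "rbd");
  assert(getval_sketch<std::int64_t>(cmdmap, "size").value_or(0) == 3);
  // wrong type and missing key both yield an empty optional
  assert(!getval_sketch<std::int64_t>(cmdmap, "pool").has_value());
  assert(!getval_sketch<bool>(cmdmap, "missing").has_value());
}
```

The caller can then write `if (auto v = getval_sketch<T>(...))` and
keep the value and the "was it present" check in one expression.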

Signed-off-by: Radoslaw Zarzynski <rzarzyns@redhat.com>
4 years agocommon/cmdparse: return cmd option using return value
Kefu Chai [Thu, 20 May 2021 03:00:06 +0000 (11:00 +0800)]
common/cmdparse: return cmd option using return value

instead of

- always returning "true"
- returning the value through an output parameter

just return the value with retval. simpler this way.

Signed-off-by: Kefu Chai <kchai@redhat.com>
4 years agocommon/cmdparse: use map::find() only a single time
Kefu Chai [Thu, 20 May 2021 02:27:37 +0000 (10:27 +0800)]
common/cmdparse: use map::find() only a single time

instead of using the combo of

if (map.count(key)) {
  return map.find(key)->second;
}

just use

auto found = map.find(key);
if (found != map.end()) {
  return found->second;
}

to avoid repeating the lookup in the map with the same key.
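
A self-contained version of the second pattern might look like the
sketch below. `lookup_once` and the `int` value type are illustrative
assumptions, not names taken from cmdparse.

```cpp
#include <cassert>
#include <map>
#include <optional>
#include <string>

using int_map_sketch = std::map<std::string, int>;

// Searches for `key` exactly once and reuses the iterator for the result,
// instead of calling count() and then find() with the same key.
inline std::optional<int> lookup_once(const int_map_sketch& m,
                                      const std::string& key)
{
  auto found = m.find(key);
  if (found != m.end()) {
    return found->second;
  }
  return std::nullopt;
}

// Usage check: call this from main() to exercise the helper.
inline void check_lookup_once()
{
  const int_map_sketch m{{"a", 1}, {"b", 2}};
  assert(lookup_once(m, "b").value_or(-1) == 2);
  assert(!lookup_once(m, "z").has_value());
}
```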

Signed-off-by: Kefu Chai <kchai@redhat.com>
4 years agocommon/cmdparse: use string_view for the key
Kefu Chai [Thu, 20 May 2021 02:20:50 +0000 (10:20 +0800)]
common/cmdparse: use string_view for the key

for better usability and performance. the main use case of
cmd_getval() and cmd_putval() only uses a literal string for the key,
so it's a waste to build a std::string out of it and throw it away after
looking up the cmdmap with it.
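
To illustrate the point, here is a small sketch that assumes the map
uses a transparent comparator (`std::less<>`), which is what lets
`find()` accept a `std::string_view`. `map_sketch` and `has_key` are
hypothetical names used only for this example.

```cpp
#include <cassert>
#include <functional>
#include <map>
#include <string>
#include <string_view>

// std::less<> is a transparent comparator, so find() can compare the stored
// std::string keys against a std::string_view without building a std::string.
using map_sketch = std::map<std::string, int, std::less<>>;

inline bool has_key(const map_sketch& m, std::string_view key)
{
  return m.find(key) != m.end();
}

// Usage check: call this from main() to exercise the helper.
inline void check_has_key()
{
  map_sketch m;
  m.emplace("format", 1);
  // the literal converts to a string_view; no temporary std::string is built
  assert(has_key(m, "format"));
  assert(!has_key(m, "prefix"));
}
```

With the default `std::less<std::string>` comparator the same call would
not compile for a `std::string_view` argument, and a literal key would be
converted to a temporary `std::string` on every lookup.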

Signed-off-by: Kefu Chai <kchai@redhat.com>
4 years agoMerge pull request #40910 from galsalomon66/update_s3select_submodule_and_s3tests_hea...
Matt Benjamin [Fri, 21 May 2021 23:17:53 +0000 (19:17 -0400)]
Merge pull request #40910 from galsalomon66/update_s3select_submodule_and_s3tests_head_10apr

rgw/s3select: update s3select submodule to last commit, (new features), update for test coverage(s3test)

4 years agoMerge PR #41479 into master
Sage Weil [Fri, 21 May 2021 22:22:30 +0000 (18:22 -0400)]
Merge PR #41479 into master

* refs/pull/41479/head:
qa/tasks/cephadm.conf: log_to_journald=false

Reviewed-by: Sebastian Wagner <swagner@suse.com>
4 years agoqa/suites/rados/thrash-old-clients: remove luminous and mimic
Neha Ojha [Fri, 21 May 2021 21:38:24 +0000 (21:38 +0000)]
qa/suites/rados/thrash-old-clients: remove luminous and mimic

We support N-3 client versions.

Signed-off-by: Neha Ojha <nojha@redhat.com>
4 years agoqa: remove cosbench workloads from perf suites 41486/head
Neha Ojha [Fri, 21 May 2021 20:17:11 +0000 (20:17 +0000)]
qa: remove cosbench workloads from perf suites

Due to https://tracker.ceph.com/issues/49139

Signed-off-by: Neha Ojha <nojha@redhat.com>
4 years agoMerge pull request #41472 from cyx1231st/wip-seastore-onode-tree-errorhandling
Samuel Just [Fri, 21 May 2021 19:26:53 +0000 (12:26 -0700)]
Merge pull request #41472 from cyx1231st/wip-seastore-onode-tree-errorhandling

crimson/onode-staged-tree: tolerate eagain and add proper errorhandling

Reviewed-by: Samuel Just <sjust@redhat.com>
4 years agoupdate to s3select/master (new features) 40910/head
gal salomon [Mon, 19 Apr 2021 11:54:15 +0000 (14:54 +0300)]
update to s3select/master (new features)

Signed-off-by: gal salomon <gal.salomon@gmail.com>
force-branch to s3test/master

Signed-off-by: gal salomon <gal.salomon@gmail.com>
update of s3select (== -> =)

Signed-off-by: gal salomon <gal.salomon@gmail.com>
closing documentation gaps between previous and current functionalities(WIP)

Signed-off-by: gal salomon <gal.salomon@gmail.com>
Update s3select.rst

additional features

Update s3select.rst

Signed-off-by: gal salomon <gal.salomon@gmail.com>
Update s3select.rst

editorial
Signed-off-by: gal salomon <gal.salomon@gmail.com>
Update s3select.rst

editorial
Signed-off-by: gal salomon <gal.salomon@gmail.com>
Update s3select.rst

adding cast to bool
Signed-off-by: gal salomon <gal.salomon@gmail.com>
editorial; more description for substring and trim

Signed-off-by: gal salomon <gal.salomon@gmail.com>
4 years agoMerge PR #41441 into master
Patrick Donnelly [Fri, 21 May 2021 18:16:36 +0000 (11:16 -0700)]
Merge PR #41441 into master

* refs/pull/41441/head:
.github/labeler: add nfs label

Reviewed-by: Patrick Donnelly <pdonnell@redhat.com>
Reviewed-by: Sebastian Wagner <swagner@suse.com>
4 years agoqa: use ubuntu_latest for perf suites
Neha Ojha [Fri, 21 May 2021 16:15:37 +0000 (16:15 +0000)]
qa: use ubuntu_latest for perf suites

Signed-off-by: Neha Ojha <nojha@redhat.com>
4 years agorbd-mirror: fix segfault in snapshot replayer shutdown 41480/head
Arthur Outhenin-Chalandre [Fri, 21 May 2021 15:05:24 +0000 (17:05 +0200)]
rbd-mirror: fix segfault in snapshot replayer shutdown

If an error arises in the init flow of the snapshot replayer and the
function returns before the call to `register_local_update_watcher`,
the value of `m_update_watch_ctx` will not be initialized. Therefore,
during the shutdown phase, the replayer will try to free this pointer
and segfault.

This commit fixes this issue by setting `m_update_watch_ctx` to
`nullptr`.

Fixes: https://tracker.ceph.com/issues/50931
Signed-off-by: Arthur Outhenin-Chalandre <arthur.outhenin-chalandre@cern.ch>
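
The failure mode described above can be sketched as follows. This is an illustration only, not the rbd-mirror code: ReplayerSketch, WatchCtxSketch, and the fail_early flag are hypothetical, and the sketch assumes the shutdown path frees the pointer with delete. Only the member name m_update_watch_ctx is taken from the commit message.

```cpp
#include <cassert>

// Hypothetical stand-in for the update-watch context.
struct WatchCtxSketch {};

class ReplayerSketch {
public:
  // Returns false when init fails before the watcher is registered.
  bool init(bool fail_early) {
    if (fail_early) {
      return false;              // m_update_watch_ctx is still nullptr here
    }
    m_update_watch_ctx = new WatchCtxSketch();
    return true;
  }

  void shut_down() {
    delete m_update_watch_ctx;   // deleting nullptr is a well-defined no-op
    m_update_watch_ctx = nullptr;
  }

  bool has_watch_ctx() const { return m_update_watch_ctx != nullptr; }

private:
  // Without this initializer the pointer would hold an indeterminate
  // value after an early return from init(), and shut_down() would
  // free a garbage address.
  WatchCtxSketch* m_update_watch_ctx = nullptr;
};
```

With the initializer in place, calling shut_down() after a failed init() is harmless.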
4 years agoMerge pull request #41464 from jdurgin/wip-bib
Yuri Weinstein [Fri, 21 May 2021 15:34:21 +0000 (08:34 -0700)]
Merge pull request #41464 from jdurgin/wip-bib

script/build-integration-branch: always generate merge commits

Reviewed-by: Yuri Weinstein <yweinste@redhat.com>
4 years agoMerge pull request #41476 from tchaikov/wip-crimson-options
Kefu Chai [Fri, 21 May 2021 14:26:10 +0000 (22:26 +0800)]
Merge pull request #41476 from tchaikov/wip-crimson-options

crimson/osd: disable allow_guessing when parsing command line options

Reviewed-by: Radoslaw Zarzynski <rzarzyns@redhat.com>
4 years agoqa/tasks/cephadm.conf: log_to_journald=false 41479/head
Sage Weil [Fri, 21 May 2021 13:51:47 +0000 (09:51 -0400)]
qa/tasks/cephadm.conf: log_to_journald=false

For teuthology runs, we set log_to_stderr=false, so that we only see
derr-level events in the container log (and teuthology.log).  Now that we
log directly to journald, set log_to_journald=false too, so that we don't
see level-20 logs in teuthology.log.

Signed-off-by: Sage Weil <sage@newdream.net>
4 years agoMerge pull request #41474 from rhcs-dashboard/fix-50918-master
Ernesto Puerta [Fri, 21 May 2021 13:31:45 +0000 (15:31 +0200)]
Merge pull request #41474 from rhcs-dashboard/fix-50918-master

mgr/dashboard: remove non-null id in Grafana dashboard

Reviewed-by: Guillaume Abrioux <gabrioux@redhat.com>
4 years agocrimson/osd: disable allow_guessing when parsing command line options 41476/head
Kefu Chai [Fri, 21 May 2021 12:10:38 +0000 (20:10 +0800)]
crimson/osd: disable allow_guessing when parsing command line options

we pass "--id <n>" to ceph-osd to specify the osd id, but the seastar
app template also provides an option "--idle-poll-time-us arg".
boost::program_options::command_line_parser() uses default_style when
parsing options, and default_style includes allow_guessing, which in
turn matches partial option names as well, so "--id" matches "--idle"
when we are trying to figure out which options are consumed by the seastar
app template, and which are not. see
https://www.boost.org/doc/libs/1_76_0/doc/html/boost/program_options/command_line_style/style_t.html

so, in this change, the style is specified explicitly, and "allow_guessing"
is removed from "default_style" before it is passed to style(), so
that only full option names are matched.

Signed-off-by: Kefu Chai <kchai@redhat.com>
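
The effect of guessing can be illustrated without Boost. The sketch below is a simplified emulation written for this note, not Boost's implementation: match_option_sketch is a hypothetical helper that accepts an unambiguous prefix of a long option name only when allow_guessing is true.

```cpp
#include <cassert>
#include <string>
#include <string_view>
#include <vector>

// Returns the known long option selected by `arg`, or "" if none.
// With allow_guessing == true, an unambiguous prefix is accepted too.
std::string match_option_sketch(const std::vector<std::string>& known,
                                std::string_view arg,
                                bool allow_guessing)
{
  std::string matched;
  int prefix_hits = 0;
  for (const auto& name : known) {
    if (name == arg) {
      return name;                       // an exact match always wins
    }
    if (allow_guessing && name.compare(0, arg.size(), arg) == 0) {
      matched = name;                    // `arg` is a prefix of `name`
      ++prefix_hits;
    }
  }
  // A guess is only accepted when exactly one option shares the prefix.
  return prefix_hits == 1 ? matched : std::string{};
}
```

With known = {"idle-poll-time-us", "smp"}, the argument "id" selects "idle-poll-time-us" when guessing is enabled and nothing when it is disabled, which is the behaviour the change relies on.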
4 years agomgr/dashboard: remove non-null id in Grafana dashb 41474/head
Ernesto Puerta [Fri, 21 May 2021 08:57:23 +0000 (10:57 +0200)]
mgr/dashboard: remove non-null id in Grafana dashb

Testing added to prevent this situation.

Fixes: https://tracker.ceph.com/issues/50918
Signed-off-by: Ernesto Puerta <epuertat@redhat.com>
4 years agoos/bluestore/bluestore_tool: use std::filesystem 41433/head
Kefu Chai [Thu, 20 May 2021 05:46:59 +0000 (13:46 +0800)]
os/bluestore/bluestore_tool: use std::filesystem

for better readability

Signed-off-by: Kefu Chai <kchai@redhat.com>
4 years agoos/bluestore/bluestore_tool: use boost::filesystem as an alternative
Kefu Chai [Fri, 21 May 2021 04:10:50 +0000 (04:10 +0000)]
os/bluestore/bluestore_tool: use boost::filesystem as an alternative

the libstdc++ shipped with GCC 7.5 does not have good support for
std::filesystem. among other things, it does not offer
std::filesystem::weakly_canonical(), but boost::filesystem does,
and boost::filesystem is compatible with std::filesystem to some
degree. so let's use it if <filesystem> is not available, which we can
take as a signal that std::filesystem is not quite ready yet.

Signed-off-by: Kefu Chai <kchai@redhat.com>
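
One way to express such a fallback is sketched below. It assumes the header check is done with __has_include and that Boost.Filesystem is installed on toolchains that take the fallback branch; the actual change may detect this differently. normalize_sketch is a hypothetical helper.

```cpp
#include <cassert>
#include <string>

#if __has_include(<filesystem>)
#include <filesystem>
namespace fs = std::filesystem;
#else
// Fallback for older toolchains; requires Boost.Filesystem.
#include <boost/filesystem.hpp>
namespace fs = boost::filesystem;
#endif

// Normalizes a path whose trailing components may not exist yet.
fs::path normalize_sketch(const std::string& p)
{
  return fs::weakly_canonical(fs::path(p));
}
```

Code written against the fs alias compiles with either library as long as it sticks to the API the two have in common.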
4 years agocrimson/onode-staged-tree: detect errors from seastore backend 41472/head
Yingxin Cheng [Fri, 21 May 2021 06:42:50 +0000 (14:42 +0800)]
crimson/onode-staged-tree: detect errors from seastore backend

Signed-off-by: Yingxin Cheng <yingxin.cheng@intel.com>
4 years agocrimson/onode-staged-tree: add asserts
Yingxin Cheng [Fri, 21 May 2021 06:41:33 +0000 (14:41 +0800)]
crimson/onode-staged-tree: add asserts

Signed-off-by: Yingxin Cheng <yingxin.cheng@intel.com>
4 years agocrimson/onode-staged-tree: tolerate eagain during fix_index()
Yingxin Cheng [Fri, 21 May 2021 06:37:40 +0000 (14:37 +0800)]
crimson/onode-staged-tree: tolerate eagain during fix_index()

Fix the one-directional link during fix_index() which causes errors
during node destruction under interruptive eagain.

Signed-off-by: Yingxin Cheng <yingxin.cheng@intel.com>
4 years agocrimson/onode-staged-tree: cleanup Node tracking logic for eagain
Yingxin Cheng [Thu, 20 May 2021 07:27:19 +0000 (15:27 +0800)]
crimson/onode-staged-tree: cleanup Node tracking logic for eagain

Introduce deref_super/parent() to make sure the bi-directional links
are reset together to survive eagain.

Signed-off-by: Yingxin Cheng <yingxin.cheng@intel.com>
4 years agocrimson/onode-staged-tree: reduce unit test efforts
Yingxin Cheng [Thu, 20 May 2021 03:15:49 +0000 (11:15 +0800)]
crimson/onode-staged-tree: reduce unit test efforts

Signed-off-by: Yingxin Cheng <yingxin.cheng@intel.com>
4 years agocrimson/onode-staged-tree: validate insert/lookup/erase with eagain
Yingxin Cheng [Thu, 20 May 2021 03:14:31 +0000 (11:14 +0800)]
crimson/onode-staged-tree: validate insert/lookup/erase with eagain

Signed-off-by: Yingxin Cheng <yingxin.cheng@intel.com>
4 years agocrimson/onode-staged-tree: fix initialization in perf tool
Yingxin Cheng [Thu, 20 May 2021 01:46:17 +0000 (09:46 +0800)]
crimson/onode-staged-tree: fix initialization in perf tool

Signed-off-by: Yingxin Cheng <yingxin.cheng@intel.com>
4 years agocrimson/onode-staged-tree: free resources when calling submit_transaction()
Yingxin Cheng [Wed, 19 May 2021 08:26:10 +0000 (16:26 +0800)]
crimson/onode-staged-tree: free resources when calling submit_transaction()

Signed-off-by: Yingxin Cheng <yingxin.cheng@intel.com>
4 years agocrimson/onode-staged-tree: tolerate eagain between extent allocation and node initial...
Yingxin Cheng [Wed, 19 May 2021 08:23:02 +0000 (16:23 +0800)]
crimson/onode-staged-tree: tolerate eagain between extent allocation and node initialization

Signed-off-by: Yingxin Cheng <yingxin.cheng@intel.com>
4 years agocrimson/onode-staged-tree: distinguish extent state between retired and invalid
Yingxin Cheng [Wed, 19 May 2021 07:36:53 +0000 (15:36 +0800)]
crimson/onode-staged-tree: distinguish extent state between retired and invalid

Signed-off-by: Yingxin Cheng <yingxin.cheng@intel.com>