git-server-git.apps.pok.os.sepia.ceph.com Git - ceph.git/log

luo.runbing [Wed, 26 May 2021 02:41:40 +0000 (10:41 +0800)]

doc: add missing crush-device-class={device-class} pair for clay code profile

`crush-device-class` is optional for `ceph osd erasure-code-profile set`,
add it for the sake of completeness

Signed-off-by: luo.runbing <luo.runbing@zte.com.cn>

Kefu Chai [Wed, 26 May 2021 01:23:35 +0000 (09:23 +0800)]

Merge pull request #41401 from rzarzynski/wip-crimson-injectdataerr

crimson/osd, common: implement the inject{m,}dataerr admin commands

Reviewed-by: Kefu Chai <kchai@redhat.com>

Kefu Chai [Wed, 26 May 2021 00:29:13 +0000 (08:29 +0800)]

Merge pull request #41541 from liu-chunmei/seastore-add-devs

crimson/seastore: add --seastore-devs in vstart.sh

Reviewed-by: Samuel Just <sjust@redhat.com>
Reviewed-by: Kefu Chai <kchai@redhat.com>

Kefu Chai [Wed, 26 May 2021 00:18:27 +0000 (08:18 +0800)]

Merge pull request #41533 from tchaikov/wip-doc-rgw-conf

doc/radosgw: use confval directive to define options

Reviewed-by: Josh Durgin <jdurgin@redhat.com>
Reviewed-by: Casey Bodley <cbodley@redhat.com>

Radoslaw Zarzynski [Wed, 19 May 2021 12:38:22 +0000 (12:38 +0000)]

crimson/osd: implement the injectmdataerr admin command.

Signed-off-by: Radoslaw Zarzynski <rzarzyns@redhat.com>

Radoslaw Zarzynski [Wed, 19 May 2021 12:30:23 +0000 (12:30 +0000)]

crimson/osd: implement the injectdataerr admin command.

Signed-off-by: Radoslaw Zarzynski <rzarzyns@redhat.com>

Radoslaw Zarzynski [Tue, 18 May 2021 14:38:50 +0000 (14:38 +0000)]

crimson/os: implement inject_{m,}data_error in AlienStore.

Signed-off-by: Radoslaw Zarzynski <rzarzyns@redhat.com>

Kefu Chai [Wed, 26 May 2021 00:03:46 +0000 (08:03 +0800)]

Merge pull request #41434 from tchaikov/wip-cmd-getval

common/cmdparse: use string_view for the key and return val by retval

Reviewed-by: Radoslaw Zarzynski <rzarzyns@redhat.com>

chunmei-liu [Tue, 25 May 2021 20:34:00 +0000 (13:34 -0700)]

crimson/seastore: add --seastore-devs in vstart.sh

to support /dev/xxx as seastore device

Signed-off-by: chunmei-liu <chunmei.liu@intel.com>

Sage Weil [Tue, 25 May 2021 20:17:44 +0000 (16:17 -0400)]

Merge PR #41007 into master

* refs/pull/41007/head:
qa/tasks/cephfs/test_nfs: fix info test
doc/cephfs/fs-nfs-exports: document --ingress --virtual-ip
mgr/nfs: move ingress vs virtual_ip check to cluster interface
PendingReleaseNotes: clarify deprecated
PendingReleaseNotes: note breaking CLI changes
doc/cephadm/nfs: document nfs+ingress
qa/suites/rados/cephadm/smoke-roleless: test nfs, nfs + ingress
mgr/nfs: take --ingress argument to 'nfs cluster create'
mgr/cephadm: adjust debug output for device refresh
mgr/cephadm: ingress: fix log msg
mgr/cephadm: fix logging of config/placement errors
common/options: enable nfs module for new clusters
cephadm: --stop-signal=SIGTERM
mgr/orchestrator: default nfs pool, namespaces
mgr/cephadm: nfs: create pool if it doesn't yet exist
doc/cephadm/nfs: update
mgr/nfs: change 'nfs cluster info'
mgr/nfs: take optional virtual_ip for deploying ingress
mgr/nfs: remove 'nfs cluster update'
mgr/nfs: factor out ganesha pool creation
mgr/nfs: delete -> rm for CLI
mgr/nfs: add some type annotations
python-common: fix IngressSpec yaml dump
mgr/cephadm: ingress: remove eth0 default
qa/tasks/cephadm: allow mounting volumes in shell
cephadm: add -v arg to shell
qa/tasks/vip: add 'vip.exec' task
mgr/orchestrator: add --port arg to 'orch apply nfs'
mgr/cephadm: nfs: add purge
mgr/cephadm: ingress: support nfs
mgr/cephadm: do not reconfigure daemons on deleted services
mgr/cephadm: nfs: shell out to rados tool for conf creation
mgr/cephadm: nfs: add rank to grace file from mgr module
mgr/cephadm: nfs: bind ganesha to appropriate ip:port
mgr/cephadm: enable ranked daemons for nfs
mgr/cephadm: support creation of daemons with ranks
mgr/cephadm: make _plan show removed daemon names
mgr/cephadm/schedule: assign/map ranks
mgr/cephadm: add rank[_generation] properties
mgr/cephadm/inventory: store optional rank_map along with specs
mgr/cephadm: include service_name in generated DaemonDescription
mgr/orchestrator: include service_name in DaemonDescription dump
mgr/cephadm/inventory: fix deleted check
mgr/cephadm: simplify
mgr/cephadm/schedule: make placement shuffle deterministic
mgr/cephadm: document CephadmService flags

Reviewed-by: Michael Fritch <mfritch@suse.com>
Reviewed-by: Varsha Rao <varao@redhat.com>

Sage Weil [Tue, 25 May 2021 20:17:21 +0000 (16:17 -0400)]

Merge PR #41539 into master

* refs/pull/41539/head:
doc/cephadm: fix prompts in service-management.rst

Reviewed-by: Sage Weil <sage@redhat.com>

Zac Dover [Tue, 25 May 2021 19:22:56 +0000 (05:22 +1000)]

doc/cephadm: fix prompts in service-management.rst

This PR formats the prompts in service-management.rst
properly.

Signed-off-by: Zac Dover <zac.dover@gmail.com>

Kefu Chai [Wed, 19 May 2021 14:36:07 +0000 (22:36 +0800)]

doc/radosgw: use confval directive to define options

less repeating this way

Signed-off-by: Kefu Chai <kchai@redhat.com>

Sage Weil [Fri, 7 May 2021 19:01:10 +0000 (15:01 -0400)]

qa/tasks/cephfs/test_nfs: fix info test

Signed-off-by: Sage Weil <sage@newdream.net>

Sage Weil [Mon, 24 May 2021 15:16:45 +0000 (11:16 -0400)]

doc/cephfs/fs-nfs-exports: document --ingress --virtual-ip

Signed-off-by: Sage Weil <sage@newdream.net>

Sage Weil [Tue, 18 May 2021 22:02:25 +0000 (18:02 -0400)]

mgr/nfs: move ingress vs virtual_ip check to cluster interface

Signed-off-by: Sage Weil <sage@newdream.net>

Sage Weil [Fri, 7 May 2021 15:01:57 +0000 (11:01 -0400)]

PendingReleaseNotes: clarify deprecated

Signed-off-by: Sage Weil <sage@newdream.net>

Sage Weil [Fri, 7 May 2021 14:58:45 +0000 (10:58 -0400)]

PendingReleaseNotes: note breaking CLI changes

Signed-off-by: Sage Weil <sage@newdream.net>

Sage Weil [Thu, 6 May 2021 22:47:38 +0000 (18:47 -0400)]

doc/cephadm/nfs: document nfs+ingress

Signed-off-by: Sage Weil <sage@newdream.net>

Sage Weil [Fri, 30 Apr 2021 15:37:51 +0000 (11:37 -0400)]

qa/suites/rados/cephadm/smoke-roleless: test nfs, nfs + ingress

Still missing a full client mount test, though!

Signed-off-by: Sage Weil <sage@newdream.net>

Sage Weil [Thu, 6 May 2021 22:47:27 +0000 (18:47 -0400)]

mgr/nfs: take --ingress argument to 'nfs cluster create'

It is likely that the rook/k8s variation of ingress will not take a
virtual_ip argument. We want to make sure that ingress yes/no can be
specified independent of the virtual_ip.

Signed-off-by: Sage Weil <sage@newdream.net>

Sage Weil [Thu, 6 May 2021 18:37:14 +0000 (14:37 -0400)]

mgr/cephadm: adjust debug output for device refresh

Signed-off-by: Sage Weil <sage@newdream.net>

Sage Weil [Thu, 6 May 2021 18:16:43 +0000 (14:16 -0400)]

mgr/cephadm: ingress: fix log msg

Signed-off-by: Sage Weil <sage@newdream.net>

Sage Weil [Thu, 6 May 2021 18:16:38 +0000 (14:16 -0400)]

mgr/cephadm: fix logging of config/placement errors

Signed-off-by: Sage Weil <sage@newdream.net>

Sage Weil [Thu, 6 May 2021 15:21:49 +0000 (11:21 -0400)]

common/options: enable nfs module for new clusters

Signed-off-by: Sage Weil <sage@newdream.net>

Sage Weil [Thu, 6 May 2021 14:57:46 +0000 (10:57 -0400)]

cephadm: --stop-signal=SIGTERM

haproxy's container image tells docker|podman to send SIGUSR1 for a "clean"
shutdown.  For NFS, the connections never close, so we will always hit the
podman|docker 10s timeout and get a SIGKILL.  That, in turn, causes haproxy
to exit with 143, and puts the systemd unit in a failed state.

This highlights a general problem(?) with stopping containers: if they don't
do it quickly then we'll end up in this error state.  We don't directly
address that here.

Avoid this problem by always stopping containers with SIGTERM.  In the
haproxy case, that means an immediate shutdown (no graceful drain of
open connections).  In theory we could do this only for haproxy with
NFS, but we can easily imagine RGW connections that don't close in 10s
either, and we don't want containers exiting in error state--we just
want the proxy to stop quickly.

Signed-off-by: Sage Weil <sage@newdream.net>

Sage Weil [Mon, 3 May 2021 15:48:45 +0000 (11:48 -0400)]

mgr/orchestrator: default nfs pool, namespaces

Apply nfs default pool (currently 'nfs-ganesha'), and default the
namespace to the service_id.

There is no practical reason for users to ever need to change this, and
requiring them to provide this information at config/apply time just
complicates life.

Signed-off-by: Sage Weil <sage@newdream.net>

Sage Weil [Mon, 3 May 2021 15:42:13 +0000 (11:42 -0400)]

mgr/cephadm: nfs: create pool if it doesn't yet exist

Signed-off-by: Sage Weil <sage@newdream.net>

Sage Weil [Wed, 5 May 2021 16:26:28 +0000 (12:26 -0400)]

doc/cephadm/nfs: update

- leave off pool/ns, since they should almost never be necessary.
- add port

Signed-off-by: Sage Weil <sage@newdream.net>

Sage Weil [Tue, 4 May 2021 17:10:14 +0000 (13:10 -0400)]

mgr/nfs: change 'nfs cluster info'

- include the virtual_ip and port at top level
- move backend server list into a sub-item
- include (haproxy) monitoring port

Signed-off-by: Sage Weil <sage@newdream.net>

Sage Weil [Tue, 4 May 2021 17:09:38 +0000 (13:09 -0400)]

mgr/nfs: take optional virtual_ip for deploying ingress

For 'nfs cluster create', optionally take a virtual_ip to deploy ingress.

Signed-off-by: Sage Weil <sage@newdream.net>

Sage Weil [Wed, 5 May 2021 16:59:44 +0000 (12:59 -0400)]

mgr/nfs: remove 'nfs cluster update'

This command is very awkward to implement unless all service spec fields
are always required. That will soon mean both the placement *and*
virtual_ip (if any), making it much less useful for a human to make use
of.

Instead, let them update yaml, or adjust the nfs and/or ingress specs
directly. I don't think this command is needed.

Signed-off-by: Sage Weil <sage@newdream.net>

Kefu Chai [Wed, 19 May 2021 14:35:36 +0000 (22:35 +0800)]

doc/_ext: render :example: field of an option

Some options have this field in their documentation; let's render it as
well.

Signed-off-by: Kefu Chai <kchai@redhat.com>

Kefu Chai [Tue, 25 May 2021 12:21:55 +0000 (20:21 +0800)]

Merge pull request #41526 from rzarzynski/wip-crimson-drop-handle_failed_op

crimson/osd: drop the unused handle_failed_op() from PG.

Reviewed-by: Kefu Chai <kchai@redhat.com>

Radoslaw Zarzynski [Tue, 25 May 2021 10:10:07 +0000 (10:10 +0000)]

crimson/osd: drop the unused handle_failed_op() from PG.

It became unused after the `InternalClientRequest` rework.

Signed-off-by: Radoslaw Zarzynski <rzarzyns@redhat.com>

Ernesto Puerta [Tue, 25 May 2021 08:25:30 +0000 (10:25 +0200)]

Merge pull request #41447 from rhcs-dashboard/50909-fix-nfs-rgw-tenant-user

mgr/dashboard: show RGW tenant user id correctly in 'NFS create export' form

Reviewed-by: Waad Alkhoury <walkhour@redhat.com>
Reviewed-by: Alfonso Martínez <almartin@redhat.com>
Reviewed-by: Avan Thakkar <athakkar@redhat.com>
Reviewed-by: Ernesto Puerta <epuertat@redhat.com>
Reviewed-by: Nizamudeen A <nia@redhat.com>

Kefu Chai [Tue, 25 May 2021 08:09:56 +0000 (16:09 +0800)]

Merge pull request #41518 from cyx1231st/wip-seastore-onode-tree-fix-test

crimson/onode-staged-tree: fix a use-after-free issue in test

Reviewed-by: Kefu Chai <kchai@redhat.com>

Kefu Chai [Tue, 25 May 2021 08:03:42 +0000 (16:03 +0800)]

Merge pull request #41515 from tchaikov/wip-crimson-cleanup

crimson/osd: do not capture unused variable

Reviewed-by: Radoslaw Zarzynski <rzarzyns@redhat.com>
Reviewed-by: Samuel Just <sjust@redhat.com>

Kefu Chai [Tue, 25 May 2021 06:17:09 +0000 (14:17 +0800)]

crimson/osd: do not capture unused variable

Signed-off-by: Kefu Chai <kchai@redhat.com>

Alfonso Martínez [Thu, 20 May 2021 15:51:35 +0000 (17:51 +0200)]

mgr/dashboard: show RGW tenant user id correctly in 'NFS create export' form.

Fixes: https://tracker.ceph.com/issues/50909
Signed-off-by: Alfonso Martínez <almartin@redhat.com>

Yingxin Cheng [Tue, 25 May 2021 06:21:21 +0000 (14:21 +0800)]

crimson/onode-staged-tree: fix a use-after-free issue in test

Signed-off-by: Yingxin Cheng <yingxin.cheng@intel.com>

Kefu Chai [Tue, 25 May 2021 04:14:31 +0000 (12:14 +0800)]

Merge pull request #41500 from rzarzynski/wip-crison-opsequncer-assert-failure

crimson/osd: fix assertion failure in OpSequencer.

Reviewed-by: Kefu Chai <kchai@redhat.com>
Reviewed-by: Xuehan Xu <xuxuehan@360.cn>

Sunil Kumar Nagaraju [Tue, 25 May 2021 03:30:07 +0000 (09:00 +0530)]

Merge pull request #41414 from sunilkumarn417/rh_downstream

qa/tasks/cephadm.py: Include bootstrap registry options for downstream

Kefu Chai [Tue, 25 May 2021 01:31:49 +0000 (09:31 +0800)]

Merge pull request #41473 from tchaikov/wip-doc-mgr-influx

doc/mgr/influx: use :confval: directive

Reviewed-by: Josh Durgin <jdurgin@redhat.com>

Kefu Chai [Tue, 25 May 2021 01:11:52 +0000 (09:11 +0800)]

Merge pull request #41512 from liu-chunmei/crimson-fix-build-error

crimson/seastore: fix build error.

Reviewed-by: Samuel Just <sjust@redhat.com>
Reviewed-by: Radoslaw Zarzynski <rzarzyns@redhat.com>
Reviewed-by: Kefu Chai <kchai@redhat.com>

Kefu Chai [Fri, 21 May 2021 07:21:48 +0000 (15:21 +0800)]

doc/mgr/influx: use :confval: directive

less repeating this way

Signed-off-by: Kefu Chai <kchai@redhat.com>

Neha Ojha [Mon, 24 May 2021 21:44:18 +0000 (14:44 -0700)]

Merge pull request #41487 from neha-ojha/wip-toc

qa/suites/rados/thrash-old-clients: remove luminous and mimic and use centos_latest

Reviewed-by: Josh Durgin <jdurgin@redhat.com>

chunmei-liu [Mon, 24 May 2021 21:20:19 +0000 (14:20 -0700)]

crimson/seastore: fix build error.

Signed-off-by: chunmei-liu <chunmei.liu@intel.com>

Neha Ojha [Mon, 24 May 2021 19:53:46 +0000 (12:53 -0700)]

Merge pull request #41486 from neha-ojha/wip-49139-new

qa: use ubuntu_latest for perf suites and remove cosbench workloads

Reviewed-by: Kefu Chai <kchai@redhat.com>

Yuri Weinstein [Mon, 24 May 2021 19:21:00 +0000 (12:21 -0700)]

Merge pull request #41504 from yuriw/wip-yuriw-master

qa/tests - removed ref to 18.04 distro as it's not supported on master+

Reviewed-by: Sage Weil <sage@redhat.com>
Reviewed-by: Neha Ojha <nojha@redhat.com>

Ernesto Puerta [Mon, 24 May 2021 18:39:53 +0000 (20:39 +0200)]

Merge pull request #41430 from rhcs-dashboard/fix-api-docs-link

mgr/dashboard: fix API docs link

Reviewed-by: Aashish Sharma <aasharma@redhat.com>
Reviewed-by: Alfonso Martínez <almartin@redhat.com>
Reviewed-by: Avan Thakkar <athakkar@redhat.com>
Reviewed-by: Ernesto Puerta <epuertat@redhat.com>
Reviewed-by: Nizamudeen A <nia@redhat.com>
Reviewed-by: Pere Diaz Bou <pdiazbou@redhat.com>

Ernesto Puerta [Mon, 24 May 2021 18:37:25 +0000 (20:37 +0200)]

Merge pull request #41426 from rhcs-dashboard/drop-container-image-columns

mgr/dashboard: drop container image name and id from services list

Reviewed-by: Waad Alkhoury <walkhour@redhat.com>
Reviewed-by: Avan Thakkar <athakkar@redhat.com>
Reviewed-by: Ernesto Puerta <epuertat@redhat.com>
Reviewed-by: Nizamudeen A <nia@redhat.com>
Reviewed-by: Pere Diaz Bou <pdiazbou@redhat.com>

Neha Ojha [Mon, 24 May 2021 16:45:47 +0000 (16:45 +0000)]

qa/suites/rados/thrash-old-clients: use centos_latest.yaml

Use centos_latest instead of bionic because it is the only common
distro for which we build packages for nautilus and above.

Signed-off-by: Neha Ojha <nojha@redhat.com>

Radoslaw Zarzynski [Mon, 24 May 2021 11:15:51 +0000 (11:15 +0000)]

crimson/osd: fix assertion failure in OpSequencer.

`OpSequencer` assumes that the ID of a previous client request
is always lower than the ID of the current one. This is reflected
by the assertion in `OpSequencer::start_op()`. It triggered
the following failure [1] in Teuthology:

```
DEBUG 2021-05-07 08:01:41,227 [shard 0] osd - client_request(id=1, detail=osd_op(client.4171.0:1 2.2 2.7c339972 (undecoded) ondisk+retry+read+known_if_redirected e29) v8) same_interval_since: 31
ceph-osd: /home/jenkins-build/build/workspace/ceph-dev-new-build/ARCH/x86_64/AVAILABLE_ARCH/x86_64/AVAILABLE_DIST/centos8/DIST/centos8/MACHINE_SIZE/gigantic/release/17.0.0-3910-g1b18e076/rpm/el8/BUILD/ceph-
17.0.0-3910-g1b18e076/src/crimson/osd/osd_operation_sequencer.h:38: seastar::futurize_t<Result> crimson::osd::OpSequencer::start_op(HandleT&, uint64_t, uint64_t, FuncT&&) [with HandleT = crimson::PipelineHa
ndle; FuncT = crimson::interruptible::interruptor<InterruptCond>::wrap_function(Func&&) [with Func = crimson::osd::ClientRequest::start()::<lambda()> mutable::<lambda(Ref<crimson::osd::PG>)> mutable::<lambd
a()> mutable::<lambda()>; InterruptCond = crimson::osd::IOInterruptCondition]::<lambda()>; Result = crimson::interruptible::interruptible_future_detail<crimson::osd::IOInterruptCondition, seastar::future<>
>; seastar::futurize_t<Result> = crimson::interruptible::interruptible_future_detail<crimson::osd::IOInterruptCondition, seastar::future<> >; uint64_t = long unsigned int]: Assertion `prev_op < this_op' fai
led.
Aborting on shard 0.
Backtrace:
Segmentation fault.
Backtrace:
0# 0x00005592B028932F in ceph-osd
1# FatalSignal::signaled(int, siginfo_t const*) in ceph-osd
2# FatalSignal::install_oneshot_signal_handler<6>()::{lambda(int, siginfo_t*, void*)#1}::_FUN(int, siginfo_t*, void*) in ceph-osd
3# 0x00007F57B72E7B20 in /lib64/libpthread.so.0
4# gsignal in /lib64/libc.so.6
5# abort in /lib64/libc.so.6
6# 0x00007F57B58E2B09 in /lib64/libc.so.6
7# 0x00007F57B58F0DE6 in /lib64/libc.so.6
8# 0x00005592ABB8484D in ceph-osd
9# 0x00005592ABB8ACB3 in ceph-osd
10# seastar::continuation<seastar::internal::promise_base_with_type<seastar::bool_class<seastar::stop_iteration_tag> >, seastar::noncopyable_function<seastar::future<seastar::bool_class<seastar::stop_iteration_tag> > (boost::intrusive_ptr<crimson::osd::PG>&&)>, seastar::future<boost::intrusive_ptr<crimson::osd::PG> >::then_impl_nrvo<seastar::noncopyable_function<seastar::future<seastar::bool_class<seastar::stop_iteration_tag> > (boost::intrusive_ptr<crimson::osd::PG>&&)>, seastar::future<seastar::bool_class<seastar::stop_iteration_tag> > >(seastar::noncopyable_function<seastar::future<seastar::bool_class<seastar::stop_iteration_tag> > (boost::intrusive_ptr<crimson::osd::PG>&&)>&&)::{lambda(seastar::internal::promise_base_with_type<seastar::bool_class<seastar::stop_iteration_tag> >&&, seastar::noncopyable_function<seastar::future<seastar::bool_class<seastar::stop_iteration_tag> > (boost::intrusive_ptr<crimson::osd::PG>&&)>&, seastar::future_state<boost::intrusive_ptr<crimson::osd::PG> >&&)#1}, boost::intrusive_ptr<crimson::osd::PG> >::run_and_dispose() in ceph-osd
11# 0x00005592B357F88F in ceph-osd
12# 0x00005592B3584DD0 in ceph-osd
```

[1]: http://pulpito.front.sepia.ceph.com/rzarzynski-2021-05-07_07:41:02-rados-master-distro-basic-smithi/6104530

Crash analysis resulted in two observations:
1. during the request execution the acting set
   changed, the request was interrupted, and an
   attempt to re-execute it followed;
2. the interrupted request was the very first client
   request the OSD had ever seen.

Code analysis showed a problem in how `ClientRequest`
establishes `prev_op_id`: although supposed to be performed
only once for a request, it can get executed twice but only
for the very first request `OpSequencer` saw.

```cpp
void ClientRequest::may_set_prev_op()
{
  // set prev_op_id if it's not set yet
  if (__builtin_expect(prev_op_id == 0, true)) {
    prev_op_id = sequencer.get_last_issued();
  }
}
```

Unfortunately, `0` isn't a distinct value that cannot
be returned by `get_last_issued()`:

```cpp
class OpSequencer {
  // ...

  uint64_t get_last_issued() const {
    return last_issued;
  }

  // ...

  // the id of last op which is issued
  uint64_t last_issued = 0;
```

As a result, `OpSequencer` returned on the second call
a new value (actually `this_op`) violating the assertion.
The commit fixes the problem by switching from a designated
value to `std::optional`.
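
The idea behind the fix can be sketched as follows. This is a minimal, self-contained illustration and not the actual crimson code: `OpSequencerSketch`, `ClientRequestSketch` and `mark_issued()` are hypothetical stand-ins, and only the switch from the sentinel `0` to `std::optional` is taken from the description above.

```cpp
#include <cstdint>
#include <optional>

// Hypothetical stand-in for OpSequencer: tracks the id of the last issued op.
class OpSequencerSketch {
public:
  uint64_t get_last_issued() const { return last_issued; }
  void mark_issued(uint64_t id) { last_issued = id; }

private:
  uint64_t last_issued = 0;  // 0 is a legitimate value before any op is issued
};

// Hypothetical stand-in for ClientRequest.
struct ClientRequestSketch {
  OpSequencerSketch& sequencer;
  std::optional<uint64_t> prev_op_id;  // empty means "not set yet"

  void may_set_prev_op() {
    // An empty optional, unlike 0, cannot collide with a real id,
    // so a second call never overwrites the value recorded by the first.
    if (!prev_op_id.has_value()) {
      prev_op_id = sequencer.get_last_issued();
    }
  }
};
```

In this sketch, calling `may_set_prev_op()` again after the sequencer has advanced leaves `prev_op_id` at the value recorded by the first call, which is the property the assertion relies on.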

Signed-off-by: Radoslaw Zarzynski <rzarzyns@redhat.com>

Yuri Weinstein [Mon, 24 May 2021 17:40:23 +0000 (10:40 -0700)]

qa/tests - removed ref to 18.04 distro as it's not supported on master+

Signed-off-by: Yuri Weinstein <yweinste@redhat.com>

Sage Weil [Mon, 24 May 2021 17:30:27 +0000 (13:30 -0400)]

Merge PR #41451 into master

* refs/pull/41451/head:
qa/suites/rados: include rook test in rados

Reviewed-by: Yuri Weinstein <yweins@redhat.com>

Avan Thakkar [Wed, 19 May 2021 23:57:29 +0000 (05:27 +0530)]

mgr/dashboard: fix API docs link

Fixes: https://tracker.ceph.com/issues/50890
Signed-off-by: Avan Thakkar <athakkar@redhat.com>

Ilya Dryomov [Mon, 24 May 2021 08:14:38 +0000 (10:14 +0200)]

Merge pull request #41489 from onitopl/rbd_mirroring_doc

doc/rbd: add missing snapshot in command line examples

Reviewed-by: Ilya Dryomov <idryomov@gmail.com>

Kefu Chai [Mon, 24 May 2021 07:55:02 +0000 (15:55 +0800)]

Merge pull request #41492 from tchaikov/wip-mon_data_avail_crit-in-ceph.conf

vstart.sh: specify mon_data_avail_crit in ceph.conf

Reviewed-by: Radoslaw Zarzynski <rzarzyns@redhat.com>

sunilkumarn417 [Wed, 19 May 2021 10:02:45 +0000 (15:32 +0530)]

qa/tasks/cephadm: Include bootstrap registry options for downstream

- The registry-url, registry-username and registry-password bootstrap
  options are now supported. This is needed to access monitoring service
  container images.
- Usage of the RHEL distribution based cephadm in the download_cephadm task.

Signed-off-by: sunilkumarn417 <sunnagar@redhat.com>

Kefu Chai [Mon, 24 May 2021 02:45:33 +0000 (10:45 +0800)]

Merge pull request #41363 from Aran85/crimson-fix-syntax

crimson/seastore: remove unused method

Reviewed-by: Kefu Chai <kchai@redhat.com>

Kefu Chai [Mon, 24 May 2021 02:21:52 +0000 (10:21 +0800)]

vstart.sh: specify mon_data_avail_crit in ceph.conf

ceph-mon consumes this option when it boots, and exits if the ratio
of free space is lower than the specified number, which is 5% by
default. But we use `ceph -c $conf_fn config assimilate-conf -i -`
to absorb these options after the monitor starts. So, without this
change, the default value of mon_data_avail_crit is always used. If
the machine has a lower ratio of free space on the partition where the
mon store is located, ceph-mon just exits with an error message like:

2021-05-24T01:53:14.644+0000 7ff64961e580 -1 error: monitor data
filesystem reached concerning levels of available storage space
(available: 4% 17 GiB)

After this change, the option is written to ceph.conf and can be
read by ceph-mon when it boots, so the overridden value of 1% has
a chance to take effect. This helps to address some test failures
found in our "make check" runs performed by Jenkins on machines whose
disk space is enough to complete the test but whose ratio of free
space is lower than 5%.
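
To make the threshold arithmetic concrete, here is a toy model of the boot-time check. It is only a sketch under stated assumptions: `below_avail_crit_sketch` is a hypothetical name, the integer percentage rounding is assumed, and the real ceph-mon logic may differ in detail.

```cpp
#include <cstdint>

// Toy model of the boot-time check; the name and rounding are assumptions.
// Returns true when the available percentage is below the critical threshold.
bool below_avail_crit_sketch(uint64_t avail_bytes, uint64_t total_bytes,
                             unsigned crit_percent) {
  if (total_bytes == 0) {
    return true;  // treat an unknown or empty filesystem as critical
  }
  const uint64_t avail_percent = avail_bytes * 100 / total_bytes;
  return avail_percent < crit_percent;
}
```

With 17 GiB available on an assumed 425 GiB partition (4%), the check trips at the default threshold of 5 but passes once the threshold is 1, which is why the override has to be visible before the monitor boots.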

Signed-off-by: Kefu Chai <kchai@redhat.com>

Aran85 [Mon, 17 May 2021 12:57:38 +0000 (20:57 +0800)]

crimson/seastore: remove unused method

Signed-off-by: Zengran Zhang <zhangzengran@sangfor.com.cn>

Grzegorz Wieczorek [Sat, 22 May 2021 15:19:04 +0000 (17:19 +0200)]

doc/rbd: add missing snapshot in command line examples

Signed-off-by: Grzegorz Wieczorek <grzegorz.wieczorek@onito.pl>

Ilya Dryomov [Sun, 23 May 2021 14:32:19 +0000 (16:32 +0200)]

Merge pull request #41480 from MrFreezeex/fix-segfault-replayer-snapshot-shutdown

rbd-mirror: fix segfault in snapshot replayer shutdown

Reviewed-by: Ilya Dryomov <idryomov@gmail.com>

Kefu Chai [Sun, 23 May 2021 00:43:07 +0000 (08:43 +0800)]

Merge pull request #41433 from tchaikov/wip-50891

os/bluestore/bluestore_tool: compare retval stat() with -1

Reviewed-by: Igor Fedotov <ifedotov@suse.com>

Kefu Chai [Sun, 23 May 2021 00:42:03 +0000 (08:42 +0800)]

Merge pull request #41429 from ifed01/wip-ifed-fix-repair-multithreading

os/bluestore: introduce multithreading sync for bluestore's repairer

Reviewed-by: Adam Kupczyk <akupczyk@redhat.com>

Kefu Chai [Sun, 23 May 2021 00:39:33 +0000 (08:39 +0800)]

Merge pull request #41436 from runsisi/wip-fix-unit

common/options: fix option type for bluestore_block_db_size

Reviewed-by: Igor Fedotov <ifedotov@suse.com>
Reviewed-by: Kefu Chai <kchai@redhat.com>

Kefu Chai [Sun, 23 May 2021 00:38:25 +0000 (08:38 +0800)]

Merge pull request #41466 from ansiwen/install-cepfs-headers

include/cephfs: add cephfs headers to CMakeLists.txt

Reviewed-by: Kefu Chai <kchai@redhat.com>
Reviewed-by: Patrick Donnelly <pdonnell@redhat.com>

Radoslaw Zarzynski [Wed, 19 May 2021 12:28:31 +0000 (12:28 +0000)]

common: introduce std::optional-returning variants of cmd_getval().

Using an output parameter instead of returning is confusing but
common in pre-C++11 code. Let's modernize `cmd_getval()`.
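
The difference between the two styles can be sketched as below. This is not the actual `cmd_getval()` signature: `CmdMapSketch`, `getval_out_sketch` and `getval_opt_sketch` are hypothetical names, and the map is simplified to string values for illustration.

```cpp
#include <map>
#include <optional>
#include <string>

// Hypothetical, simplified command map (string values only).
using CmdMapSketch = std::map<std::string, std::string>;

// Older style: the result comes back through an output parameter.
bool getval_out_sketch(const CmdMapSketch& cmdmap, const std::string& key,
                       std::string& out) {
  auto found = cmdmap.find(key);
  if (found == cmdmap.end()) {
    return false;
  }
  out = found->second;
  return true;
}

// Modernized style: the result (or its absence) is the return value.
std::optional<std::string> getval_opt_sketch(const CmdMapSketch& cmdmap,
                                             const std::string& key) {
  auto found = cmdmap.find(key);
  if (found == cmdmap.end()) {
    return std::nullopt;
  }
  return found->second;
}
```

A caller of the second variant can write `getval_opt_sketch(m, "format").value_or("plain")` without declaring a separate variable first.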

Signed-off-by: Radoslaw Zarzynski <rzarzyns@redhat.com>

Kefu Chai [Thu, 20 May 2021 03:00:06 +0000 (11:00 +0800)]

common/cmdparse: return cmd option using return value

instead of

- always returning "true"
- returning using an input parameter

just return the value with retval. simpler this way.

Signed-off-by: Kefu Chai <kchai@redhat.com>

Kefu Chai [Thu, 20 May 2021 02:27:37 +0000 (10:27 +0800)]

common/cmdparse: use map::find() only a single time

instead of using the combo of

  if (map.count(key)) {
    return map.find(key)->second;
  }

just use

  found = map.find(key);
  if (found != map.end()) {
    return found->second;
  }

to avoid repeating the lookup in the map with the same key.

Signed-off-by: Kefu Chai <kchai@redhat.com>

Kefu Chai [Thu, 20 May 2021 02:20:50 +0000 (10:20 +0800)]

common/cmdparse: use string_view for the key

for better usability and performance. The main use case of
cmd_getval() and cmd_putval() only uses a literal string for the key,
so it's a waste to build a std::string out of it and throw it away after
looking up the cmdmap with it.
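
One way to look up a `std::string`-keyed map with a `std::string_view` is a transparent comparator. The sketch below assumes that approach; `CmdMapViewSketch` and `getval_view_sketch` are hypothetical names, and the real cmdmap type in Ceph may be declared differently.

```cpp
#include <functional>
#include <map>
#include <optional>
#include <string>
#include <string_view>

// Hypothetical command map; std::less<> enables heterogeneous lookup,
// so find() can accept a std::string_view without building a std::string.
using CmdMapViewSketch = std::map<std::string, std::string, std::less<>>;

std::optional<std::string> getval_view_sketch(const CmdMapViewSketch& cmdmap,
                                              std::string_view key) {
  auto found = cmdmap.find(key);  // single lookup, no temporary std::string
  if (found == cmdmap.end()) {
    return std::nullopt;
  }
  return found->second;
}
```

A string literal such as `"prefix"` converts to `std::string_view` implicitly, so call sites stay unchanged while the temporary key string disappears.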

Signed-off-by: Kefu Chai <kchai@redhat.com>

Matt Benjamin [Fri, 21 May 2021 23:17:53 +0000 (19:17 -0400)]

Merge pull request #40910 from galsalomon66/update_s3select_submodule_and_s3tests_head_10apr

rgw/s3select: update s3select submodule to last commit, (new features), update for test coverage(s3test)

Sage Weil [Fri, 21 May 2021 22:22:30 +0000 (18:22 -0400)]

Merge PR #41479 into master

* refs/pull/41479/head:
qa/tasks/cephadm.conf: log_to_journald=false

Reviewed-by: Sebastian Wagner <swagner@suse.com>

Neha Ojha [Fri, 21 May 2021 21:38:24 +0000 (21:38 +0000)]

qa/suites/rados/thrash-old-clients: remove luminous and mimic

We support N-3 client versions.

Signed-off-by: Neha Ojha <nojha@redhat.com>

Neha Ojha [Fri, 21 May 2021 20:17:11 +0000 (20:17 +0000)]

qa: remove cosbench workloads from perf suites

Due to https://tracker.ceph.com/issues/49139

Signed-off-by: Neha Ojha <nojha@redhat.com>

Samuel Just [Fri, 21 May 2021 19:26:53 +0000 (12:26 -0700)]

Merge pull request #41472 from cyx1231st/wip-seastore-onode-tree-errorhandling

crimson/onode-staged-tree: tolerate eagain and add proper errorhandling

Reviewed-by: Samuel Just <sjust@redhat.com>

gal salomon [Mon, 19 Apr 2021 11:54:15 +0000 (14:54 +0300)]

update to s3select/master (new features)

Signed-off-by: gal salomon <gal.salomon@gmail.com>

force-branch to s3test/master

Signed-off-by: gal salomon <gal.salomon@gmail.com>

update of s3select (== -> =)

Signed-off-by: gal salomon <gal.salomon@gmail.com>

update of s3select (== -> =)

Signed-off-by: gal salomon <gal.salomon@gmail.com>

closing documentation gaps between previous and current functionalities (WIP)

Signed-off-by: gal salomon <gal.salomon@gmail.com>

Update s3select.rst

additional features

Update s3select.rst

Signed-off-by: gal salomon <gal.salomon@gmail.com>

Update s3select.rst

editorial

Signed-off-by: gal salomon <gal.salomon@gmail.com>

Update s3select.rst

editorial

Signed-off-by: gal salomon <gal.salomon@gmail.com>

Update s3select.rst

adding cast to bool

Signed-off-by: gal salomon <gal.salomon@gmail.com>

editorial; more description for substring and trim

Signed-off-by: gal salomon <gal.salomon@gmail.com>

Patrick Donnelly [Fri, 21 May 2021 18:16:36 +0000 (11:16 -0700)]

Merge PR #41441 into master

* refs/pull/41441/head:
.github/labeler: add nfs label

Reviewed-by: Patrick Donnelly <pdonnell@redhat.com>
Reviewed-by: Sebastian Wagner <swagner@suse.com>

Neha Ojha [Fri, 21 May 2021 16:15:37 +0000 (16:15 +0000)]

qa: use ubuntu_latest for perf suites

Signed-off-by: Neha Ojha <nojha@redhat.com>

Arthur Outhenin-Chalandre [Fri, 21 May 2021 15:05:24 +0000 (17:05 +0200)]

rbd-mirror: fix segfault in snapshot replayer shutdown

If an error arises in the init flow of the snapshot replayer and the
function returns before the call on `register_local_update_watcher`
the value of `m_update_watch_ctx` will not be initialized. Therefore,
on the shutdown phase, the replayer will try to free this pointer
and segfault.

This commit fixes this issue by setting `m_update_watch_ctx` to
`nullptr`.
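
The failure mode and the fix can be illustrated with a small sketch. The class and method names here are hypothetical stand-ins and not the rbd-mirror code; the sketch also assumes the pointer is given its `nullptr` value through a default member initializer, which is one possible way to apply the fix described above.

```cpp
// Hypothetical stand-ins; not the actual rbd-mirror classes.
struct UpdateWatchCtxSketch {};

class ReplayerSketch {
public:
  // Returns false when init fails before the watcher is registered.
  bool init(bool fail_early) {
    if (fail_early) {
      return false;  // m_update_watch_ctx is never assigned on this path
    }
    m_update_watch_ctx = new UpdateWatchCtxSketch();
    return true;
  }

  void shut_down() {
    delete m_update_watch_ctx;  // deleting nullptr is a safe no-op
    m_update_watch_ctx = nullptr;
  }

  bool has_watch_ctx() const { return m_update_watch_ctx != nullptr; }

private:
  // Without this initializer the pointer would hold an indeterminate value,
  // and shut_down() after a failed init() would free a garbage address.
  UpdateWatchCtxSketch* m_update_watch_ctx = nullptr;
};
```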

Fixes: https://tracker.ceph.com/issues/50931
Signed-off-by: Arthur Outhenin-Chalandre <arthur.outhenin-chalandre@cern.ch>

Yuri Weinstein [Fri, 21 May 2021 15:34:21 +0000 (08:34 -0700)]

Merge pull request #41464 from jdurgin/wip-bib

script/build-integration-branch: always generate merge commits

Reviewed-by: Yuri Weinstein <yweinste@redhat.com>

Kefu Chai [Fri, 21 May 2021 14:26:10 +0000 (22:26 +0800)]

Merge pull request #41476 from tchaikov/wip-crimson-options

crimson/osd: disable allow_guessing when parsing command line options

Reviewed-by: Radoslaw Zarzynski <rzarzyns@redhat.com>

Sage Weil [Fri, 21 May 2021 13:51:47 +0000 (09:51 -0400)]

qa/tasks/cephadm.conf: log_to_journald=false

For teuthology runs, we set log_to_stderr=false, so that we only see
derr-level events in the container log (and teuthology.log). Now that we
log directly to journald, set log_to_journald=false too, so that we don't
see level-20 logs in teuthology.log.

Signed-off-by: Sage Weil <sage@newdream.net>

commit | commitdiff | tree

Ernesto Puerta [Fri, 21 May 2021 13:31:45 +0000 (15:31 +0200)]

Merge pull request #41474 from rhcs-dashboard/fix-50918-master

mgr/dashboard: remove non-null id in Grafana dashboard

Reviewed-by: Guillaume Abrioux <gabrioux@redhat.com>

commit | commitdiff | tree

Kefu Chai [Fri, 21 May 2021 12:10:38 +0000 (20:10 +0800)]

crimson/osd: disable allow_guessing when parsing command line options

we pass "--id <n>" to ceph-osd for specifying the osd id, but seastar
app template also provides an option of "--idle-poll-time-us arg".
boost::program_options::command_line_parser() uses default_style when
parsing options, and default_style includes allow_guessing, which in
turn matches partial option names as well. so "--id" matches "--idle"
when we are trying to figure out which options are consumed by the
seastar app template, and which are not. see
https://www.boost.org/doc/libs/1_76_0/doc/html/boost/program_options/command_line_style/style_t.html

so, in this change, style is specified explicitly, and "allow_guessing"
is removed from the "default_style" before being passed to style(), so
that only full option names are matched.
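
The effect of guessing can be illustrated with a toy model. This is not Boost's implementation: `match_option` is a hypothetical helper that only mimics the behavior described above, where an exact name always matches and a unique prefix matches only when guessing is allowed. In Boost terms, the change corresponds to passing `default_style` with the `allow_guessing` bit cleared to `style()`.

```cpp
#include <optional>
#include <string>
#include <vector>

// Toy model of option lookup (hypothetical, not Boost code).
// Exact names always match; prefixes match only if allow_guessing is set
// and exactly one known option starts with the given text.
std::optional<std::string> match_option(const std::string& given,
                                        const std::vector<std::string>& known,
                                        bool allow_guessing) {
  for (const auto& name : known) {
    if (name == given) {
      return name;
    }
  }
  if (!allow_guessing) {
    return std::nullopt;
  }
  std::optional<std::string> found;
  for (const auto& name : known) {
    if (name.compare(0, given.size(), given) == 0) {
      if (found) {
        return std::nullopt;  // ambiguous prefix: refuse to guess
      }
      found = name;
    }
  }
  return found;
}
```

With a known-option list containing only `idle-poll-time-us`, the input `id` is claimed by that option when guessing is on and left unmatched when it is off, which is the behavior the commit wants for options not owned by the seastar app template.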

Signed-off-by: Kefu Chai <kchai@redhat.com>

commit | commitdiff | tree

Ernesto Puerta [Fri, 21 May 2021 08:57:23 +0000 (10:57 +0200)]

mgr/dashboard: remove non-null id in Grafana dashb

Testing added to prevent this situation.

Fixes: https://tracker.ceph.com/issues/50918
Signed-off-by: Ernesto Puerta <epuertat@redhat.com>

commit | commitdiff | tree

Kefu Chai [Thu, 20 May 2021 05:46:59 +0000 (13:46 +0800)]

os/bluestore/bluestore_tool: use std::filesystem

for better readability

Signed-off-by: Kefu Chai <kchai@redhat.com>

commit | commitdiff | tree

Kefu Chai [Fri, 21 May 2021 04:10:50 +0000 (04:10 +0000)]

os/bluestore/bluestore_tool: use boost::filesystem as an alternative

the libstdc++ shipped with GCC 7.5 does not have good support for
std::filesystem; among other things, it does not offer
std::filesystem::weakly_canonical(), but boost::filesystem does,
and boost::filesystem is compatible with std::filesystem to some
degree. so let's use it if <filesystem> is not available; we can
take that as a signal that std::filesystem is not quite ready yet.
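
One way such a fallback can be written is sketched below, assuming the header check is done with `__has_include`; the actual commit may select the library differently. The `fs` alias and the `normalized_leaf` helper are hypothetical names for illustration.

```cpp
#if __has_include(<filesystem>)
#include <filesystem>
namespace fs = std::filesystem;
#else
// Assumes Boost.Filesystem is installed and linked when this branch is taken.
#include <boost/filesystem.hpp>
namespace fs = boost::filesystem;
#endif
#include <string>

// Hypothetical helper: normalize a path that may not exist on disk and
// return its last component. weakly_canonical() is available under the
// same name in both libraries.
std::string normalized_leaf(const std::string& raw) {
  return fs::weakly_canonical(fs::path(raw)).filename().string();
}
```

Code written against the `fs` alias stays the same whichever library is selected, as long as it sticks to the subset that both provide.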

Signed-off-by: Kefu Chai <kchai@redhat.com>

commit | commitdiff | tree

Yingxin Cheng [Fri, 21 May 2021 06:42:50 +0000 (14:42 +0800)]

crimson/onode-staged-tree: detect errors from seastore backend

Signed-off-by: Yingxin Cheng <yingxin.cheng@intel.com>

commit | commitdiff | tree

Yingxin Cheng [Fri, 21 May 2021 06:41:33 +0000 (14:41 +0800)]

crimson/onode-staged-tree: add asserts

Signed-off-by: Yingxin Cheng <yingxin.cheng@intel.com>

commit | commitdiff | tree

Yingxin Cheng [Fri, 21 May 2021 06:37:40 +0000 (14:37 +0800)]

crimson/onode-staged-tree: tolerate eagain during fix_index()

Fix the one-directional link during fix_index() which causes errors
during node destruction under interruptive eagain.

Signed-off-by: Yingxin Cheng <yingxin.cheng@intel.com>

commit | commitdiff | tree

Yingxin Cheng [Thu, 20 May 2021 07:27:19 +0000 (15:27 +0800)]

crimson/onode-staged-tree: cleanup Node tracking logic for eagain

Introduce deref_super/parent() to make sure the bi-directional links
are reset together to survive eagain.

Signed-off-by: Yingxin Cheng <yingxin.cheng@intel.com>

commit | commitdiff | tree

Yingxin Cheng [Thu, 20 May 2021 03:15:49 +0000 (11:15 +0800)]

crimson/onode-staged-tree: reduce unit test efforts

Signed-off-by: Yingxin Cheng <yingxin.cheng@intel.com>

commit | commitdiff | tree

Yingxin Cheng [Thu, 20 May 2021 03:14:31 +0000 (11:14 +0800)]

crimson/onode-staged-tree: validate insert/lookup/erase with eagain

Signed-off-by: Yingxin Cheng <yingxin.cheng@intel.com>

commit | commitdiff | tree

Yingxin Cheng [Thu, 20 May 2021 01:46:17 +0000 (09:46 +0800)]

crimson/onode-staged-tree: fix initialization in perf tool

Signed-off-by: Yingxin Cheng <yingxin.cheng@intel.com>

commit | commitdiff | tree

Yingxin Cheng [Wed, 19 May 2021 08:26:10 +0000 (16:26 +0800)]

crimson/onode-staged-tree: free resources when calling submit_transaction()

Signed-off-by: Yingxin Cheng <yingxin.cheng@intel.com>

commit | commitdiff | tree

Yingxin Cheng [Wed, 19 May 2021 08:23:02 +0000 (16:23 +0800)]

crimson/onode-staged-tree: tolerate eagain between extent allocation and node initialization

Signed-off-by: Yingxin Cheng <yingxin.cheng@intel.com>

commit | commitdiff | tree

Yingxin Cheng [Wed, 19 May 2021 07:36:53 +0000 (15:36 +0800)]

crimson/onode-staged-tree: distinguish extent state between retired and invalid

Signed-off-by: Yingxin Cheng <yingxin.cheng@intel.com>
