git.apps.os.sepia.ceph.com Git - ceph.git/log
Yingxin Cheng [Tue, 7 Dec 2021 08:48:03 +0000 (16:48 +0800)]
crimson/os/seastore: measure the number of conflicting transactions by srcs
Signed-off-by: Yingxin Cheng <yingxin.cheng@intel.com>
Yingxin Cheng [Mon, 6 Dec 2021 08:33:47 +0000 (16:33 +0800)]
crimson/os/seastore: differentiate cleaner trim/reclaim transactions
Signed-off-by: Yingxin Cheng <yingxin.cheng@intel.com>
Neha Ojha [Tue, 7 Dec 2021 21:34:15 +0000 (13:34 -0800)]
Merge pull request #44095 from Matan-B/wip-matanb-local-workunits
doc/dev: Running workunits locally
Reviewed-by: Josh Durgin <jdurgin@redhat.com>
Yuval Lifshitz [Tue, 7 Dec 2021 19:20:33 +0000 (21:20 +0200)]
Merge pull request #43940 from TRYTOBE8TME/wip-rgw-empty-config
src/rgw: Empty configuration support
Yuval Lifshitz [Tue, 7 Dec 2021 19:19:27 +0000 (21:19 +0200)]
Merge pull request #43665 from zenomri/wip-omri-multipart-trace
rgw/tracer: Multipart upload trace
Samuel Just [Tue, 7 Dec 2021 15:53:21 +0000 (07:53 -0800)]
Merge pull request #44156 from rzarzynski/wip-crimson-fix-process_op-sequencing
crimson/osd: fix sequencing issues in ClientRequest::process_op.
Reviewed-by: Samuel Just <sjust@redhat.com>
Samuel Just [Tue, 7 Dec 2021 15:52:41 +0000 (07:52 -0800)]
Merge pull request #44223 from rzarzynski/wip-crimson-fix-pullinfo-on-push
crimson/osd: don't assume a pull must happen if there is no push.
Reviewed-by: Samuel Just <sjust@redhat.com>
Reviewed-by: Chunmei Liu <chunmei.liu@intel.com>
Samuel Just [Tue, 7 Dec 2021 15:52:02 +0000 (07:52 -0800)]
Merge pull request #44224 from rzarzynski/wip-crimson-clean-msghs
crimson/osd: clean the recovery message-related header inclusion.
Reviewed-by: Chunmei Liu <chunmei.liu@intel.com>
Reviewed-by: Samuel Just <sjust@redhat.com>
Samuel Just [Tue, 7 Dec 2021 15:49:08 +0000 (07:49 -0800)]
Merge pull request #44184 from rzarzynski/wip-crimson-internal_client_request-fix-hobj
crimson/osd: fix assertion failure in InternalClientRequest.
Reviewed-by: Samuel Just <sjust@redhat.com>
Reviewed-by: Chunmei Liu <chunmei.liu@intel.com>
Alfonso Martínez [Tue, 7 Dec 2021 14:02:42 +0000 (15:02 +0100)]
Merge pull request #44145 from rhcs-dashboard/fix-frontend-vulnerabilities
mgr/dashboard: fix frontend deps' vulnerabilities
Reviewed-by: Waad Alkhoury <walkhour@redhat.com>
Reviewed-by: Pere Diaz Bou <pdiazbou@redhat.com>
Radoslaw Zarzynski [Tue, 30 Nov 2021 19:30:32 +0000 (19:30 +0000)]
crimson/osd: fix sequencing issues in ClientRequest::process_op.
The following crash has been observed in one of the runs at Sepia:
```
ceph-osd: /home/jenkins-build/build/workspace/ceph-dev-new-build/ARCH/x86_64/AVAILABLE_ARCH/x86_64/AVAILABLE_DIST/centos8/DIST/centos8/MACHINE_SIZE/gigantic/release/17.0.0-8898-ge57ad63c/rpm/el8/BUILD/ceph-17.0.0-8898-ge57ad63c/src/crimson/osd/osd_operation_sequencer.h:123: void crimson::osd::OpSequencer::finish_op_in_order(crimson::osd::ClientRequest&): Assertion `op.get_id() > last_completed_id' failed.
Aborting on shard 0.
Backtrace:
Reactor stalled for 1807 ms on shard 0. Backtrace: 0xb14ab 0x46e57428 0x46bc450d 0x46be03bd 0x46be0782 0x46be0946 0x46be0bf6 0x12b1f 0x137341 0x3fdd6a92 0x3fddccdb 0x3fdde1ee 0x3fdde8b3 0x3fdd3f2b 0x3fdd4442 0x3fdd4c3a 0x12b1f 0x3737e 0x21db4 0x21c88 0x2fa75 0x3b769527 0x3b8418af 0x3b8423cb 0x3b842ce0 0x3b84383d 0x3a116220 0x3a143f31 0x3a144bcd 0x46b96271 0x46bde51a 0x46d6891b 0x46d6a8f0 0x4681a7d2 0x4681f03b 0x39fd50f2 0x23492 0x39b7a7dd
0# gsignal in /lib64/libc.so.6
1# abort in /lib64/libc.so.6
2# 0x00007FB9FB946C89 in /lib64/libc.so.6
3# 0x00007FB9FB954A76 in /lib64/libc.so.6
4# 0x00005595E98E6528 in ceph-osd
5# 0x00005595E99BE8B0 in ceph-osd
6# 0x00005595E99BF3CC in ceph-osd
7# 0x00005595E99BFCE1 in ceph-osd
8# 0x00005595E99C083E in ceph-osd
9# 0x00005595E8293221 in ceph-osd
10# 0x00005595E82C0F32 in ceph-osd
11# 0x00005595E82C1BCE in ceph-osd
12# 0x00005595F4D13272 in ceph-osd
13# 0x00005595F4D5B51B in ceph-osd
14# 0x00005595F4EE591C in ceph-osd
15# 0x00005595F4EE78F1 in ceph-osd
16# 0x00005595F49977D3 in ceph-osd
17# 0x00005595F499C03C in ceph-osd
18# main in ceph-osd
19# __libc_start_main in /lib64/libc.so.6
20# _start in ceph-osd
```
The sequence of events provides at least two clues:
- op no. 32 finished before op no. 29, which was waiting
for `ObjectContext`,
- op no. 32 was short-lived -- it did not wait even
on `obc`.
```
rzarzynski@teuthology:/home/teuthworker/archive/rzarzynski-2021-11-22_22:01:32-rados-master-distro-basic-smithi$ less ./6520106/remote/smithi115/log/ceph-osd.3.log.gz
...
DEBUG 2021-11-22 22:32:24,531 [shard 0] osd - client_request(id=29, detail=m=[osd_op(client.4371.0:36 4.d 4.f0fb5e1d (undecoded) ondisk+retry+read+rwordered+known_if_redirected+supports_pool_eio e23) v8]): start
DEBUG 2021-11-22 22:32:24,531 [shard 0] osd - client_request(id=29, detail=m=[osd_op(client.4371.0:36 4.d 4.f0fb5e1d (undecoded) ondisk+retry+read+rwordered+known_if_redirected+supports_pool_eio e23) v8]): in repeat
...
DEBUG 2021-11-22 22:32:24,546 [shard 0] osd - client_request(id=29, detail=m=[osd_op(client.4371.0:36 4.d 4.f0fb5e1d (undecoded) ondisk+retry+read+rwordered+known_if_redirected+supports_pool_eio e23) v8]) same_interval_since: 21
DEBUG 2021-11-22 22:32:24,546 [shard 0] osd - OpSequencer::start_op: op=29, last_started=27, last_unblocked=27, last_completed=27
...
DEBUG 2021-11-22 22:32:24,621 [shard 0] osd - client_request(id=32, detail=m=[osd_op(client.4371.0:49 4.d 4.81addbad (undecoded) ondisk+retry+write+known_if_redirected+supports_pool_eio e23) v8]): start
DEBUG 2021-11-22 22:32:24,621 [shard 0] osd - client_request(id=32, detail=m=[osd_op(client.4371.0:49 4.d 4.81addbad (undecoded) ondisk+retry+write+known_if_redirected+supports_pool_eio e23) v8]): in repeat
...
DEBUG 2021-11-22 22:32:24,626 [shard 0] osd - client_request(id=32, detail=m=[osd_op(client.4371.0:49 4.d 4.81addbad (undecoded) ondisk+retry+write+known_if_redirected+supports_pool_eio e23) v8]) same_interval_since: 21
DEBUG 2021-11-22 22:32:24,626 [shard 0] osd - OpSequencer::start_op: op=32, last_started=29, last_unblocked=29, last_completed=27
<note that op 32 is very short living>
DEBUG 2021-11-22 22:32:24,669 [shard 0] osd - OpSequencer::finish_op_in_order: op=32, last_started=32, last_unblocked=32, last_completed=27
...
DEBUG 2021-11-22 22:32:24,671 [shard 0] osd - client_request(id=32, detail=m=[osd_op(client.4371.0:49 4.d 4:b5dbb581:::smithi11538976-13:head {write 601684~619341 in=619341b, stat} snapc 0={} RETRY=1 ondisk+retry+write+known_if_redirected+supports_pool_eio e23) v8]): destroying
...
DEBUG 2021-11-22 22:32:24,722 [shard 0] osd - client_request(id=29, detail=m=[osd_op(client.4371.0:36 4.d 4:b87adf0f:::smithi11538976-9:head {read 0~1} snapc 0={} RETRY=1 ondisk+retry+read+rwordered+known_if_redirected+supports_pool_eio e23) v8]): got obc lock
...
INFO 2021-11-22 22:32:24,723 [shard 0] osd - client_request(id=29, detail=m=[osd_op(client.4371.0:36 4.d 4:b87adf0f:::smithi11538976-9:head {read 0~1} snapc 0={} RETRY=1 ondisk+retry+read+rwordered+known_if_redirected+supports_pool_eio e23) v8]) obc.get()=0x6190000d5780
...
DEBUG 2021-11-22 22:32:24,753 [shard 0] osd - OpSequencer::finish_op_in_order: op=29, last_started=32, last_unblocked=32, last_completed=32
ceph-osd: /home/jenkins-build/build/workspace/ceph-dev-new-build/ARCH/x86_64/AVAILABLE_ARCH/x86_64/AVAILABLE_DIST/centos8/DIST/centos8/MACHINE_SIZE/gigantic/release/17.0.0-8898-ge57ad63c/rpm/el8/BUILD/ceph-17.0.0-8898-ge57ad63c/src/crimson/osd/osd_operation_sequencer.h:123: void crimson::osd::OpSequencer::finish_op_in_order(crimson::osd::ClientRequest&): Assertion `op.get_id() > last_completed_id' failed.
Aborting on shard 0.
```
This could be explained by a scenario where:
- op no. 32 skipped stages of the execution pipeline while
- it wrongly informed `OpSequencer` that the execution was in-order.
Static analysis shows there are multiple problems of this kind
in `ClientRequest::process_op()` and its callees, the most
recently merged one being the path for `PG::already_complete()`.
Signed-off-by: Radoslaw Zarzynski <rzarzyns@redhat.com>
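The ordering seen in the log above (op 32 completes, then op 29 tries to complete) can be illustrated with a minimal sketch of an in-order completion check. `MiniSequencer` and `replay_crash_order()` are hypothetical names invented for this illustration and are not the real `crimson::osd::OpSequencer`; the real code asserts where this sketch returns `false`.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical, simplified model; not the real crimson::osd::OpSequencer.
class MiniSequencer {
 public:
  // Ops are expected to start in increasing id order.
  void start_op(std::uint64_t id) {
    assert(id > last_started_);
    last_started_ = id;
  }

  // Returns true if completing `id` now keeps completions in order.
  // The real code asserts instead of returning false.
  bool finish_op_in_order(std::uint64_t id) {
    if (id <= last_completed_) {
      return false;  // an op with a higher id has already completed
    }
    last_completed_ = id;
    return true;
  }

  std::uint64_t last_completed() const { return last_completed_; }

 private:
  std::uint64_t last_started_ = 0;
  std::uint64_t last_completed_ = 0;
};

// Replays the order seen in the log: 29 starts, 32 starts,
// 32 finishes, then 29 finishes.
inline bool replay_crash_order() {
  MiniSequencer seq;
  seq.start_op(29);
  seq.start_op(32);
  const bool first = seq.finish_op_in_order(32);   // accepted
  const bool second = seq.finish_op_in_order(29);  // rejected: 29 <= 32
  return first && !second;
}
```

In this model the second completion is the one that gets rejected, which matches the log: the assertion fires when op 29 reports in, although the op that jumped the queue was op 32.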
Samuel Just [Tue, 7 Dec 2021 06:21:55 +0000 (22:21 -0800)]
Merge pull request #44231 from xxhdx1985126/wip-cpu-profile
crimson/os/seastore: fix compiler error for gcc > 9 and clang13
Reviewed-by: Yingxin Cheng <yingxin.cheng@intel.com>
Reviewed-by: Samuel Just <sjust@redhat.com>
Xuehan Xu [Tue, 7 Dec 2021 04:27:14 +0000 (12:27 +0800)]
crimson/os/seastore: fix compiler error for gcc > 9 and clang13
Signed-off-by: Xuehan Xu <xxhdx1985126@gmail.com>
David Galloway [Mon, 6 Dec 2021 17:58:22 +0000 (12:58 -0500)]
Merge pull request #44222 from ceph/wip-m2r
doc: Use older mistune
Radoslaw Zarzynski [Mon, 6 Dec 2021 17:29:39 +0000 (17:29 +0000)]
crimson/osd: clean the recovery message-related header inclusion.
Signed-off-by: Radoslaw Zarzynski <rzarzyns@redhat.com>
Radoslaw Zarzynski [Mon, 6 Dec 2021 16:54:41 +0000 (16:54 +0000)]
crimson/osd: don't assume a pull must happen if there is no push.
In the classical OSD, `ReplicatedBackend::recover_object()`
divides into two main flows, pull and push:
```cpp
int ReplicatedBackend::recover_object(
  const hobject_t &hoid,
  // ...
  )
{
  dout(10) << __func__ << ": " << hoid << dendl;
  RPGHandle *h = static_cast<RPGHandle *>(_h);
  if (get_parent()->get_local_missing().is_missing(hoid)) {
    ceph_assert(!obc);
    // pull
    prepare_pull(
      v,
      hoid,
      head,
      h);
  } else {
    ceph_assert(obc);
    int started = start_pushes(
      hoid,
      obc,
      h);
    // ...
  }
  return 0;
}
```
Pulls may also enter the push path (`C_ReplicatedBackend_OnPullComplete`),
but push handling makes no assumptions about that. Importantly,
`recover_object()` may result in neither a pull nor a push.
This isn't the case in crimson, as its implementation of the push path
asserts that, if no push is scheduled, `PullInfo` must be allocated.
This patch reworks this logic to reflect the classical one and to avoid
crashes like the following one:
```
DEBUG 2021-12-01 18:43:00,220 [shard 0] osd - recover_object: loaded obc: 3:4e058a2e:::smithi13839607-45:head
WARN 2021-12-01 18:43:00,220 [shard 0] none - intrusive_ptr_add_ref(p=0x6190000d7f80, use_count=3)
WARN 2021-12-01 18:43:00,220 [shard 0] none - intrusive_ptr_release(p=0x6190000d7f80, use_count=4)
TRACE 2021-12-01 18:43:00,220 [shard 0] osd - call_with_interruption_impl clearing interrupt_cond: 0x60300012b210,N7crimson3osd20IOInterruptConditionE
TRACE 2021-12-01 18:43:00,220 [shard 0] osd - call_with_interruption_impl: may_interrupt: false, local interrupt_condintion: 0x60300012b210, global interrupt_cond: 0x0,N7crimson3osd20IOInterruptConditionE
TRACE 2021-12-01 18:43:00,220 [shard 0] osd - set: interrupt_cond: 0x60300012b210, ref_count: 1
ceph-osd: /home/jenkins-build/build/workspace/ceph-dev-new-build/ARCH/x86_64/AVAILABLE_ARCH/x86_64/AVAILABLE_DIST/centos8/DIST/centos8/MACHINE_SIZE/gigantic/release/17.0.0-8902-g52fd47fe/rpm/el8/BUILD/ceph-17.0.0-8902-g52fd47fe/src/crimson/osd/replicated_recovery_backend.cc:84: ReplicatedRecoveryBackend::maybe_push_shards(const hobject_t&, eversion_t)::<lambda()>: Assertion `recovery.pi' failed.
Aborting on shard 0.
```
Signed-off-by: Radoslaw Zarzynski <rzarzyns@redhat.com>
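The three possible outcomes described in the commit above (pull, push, or neither) can be sketched as a small decision function. `RecoveryAction` and `decide_recovery()` are hypothetical names for this illustration only; the real crimson and classical OSD code tracks far more state.

```cpp
#include <cassert>
#include <cstddef>

// Hypothetical outcome of one recover_object() call in this sketch.
enum class RecoveryAction { Pull, Push, Nothing };

// locally_missing: the primary itself lacks the object.
// shards_needing_push: number of peers that still need the object.
inline RecoveryAction decide_recovery(bool locally_missing,
                                      std::size_t shards_needing_push) {
  if (locally_missing) {
    return RecoveryAction::Pull;
  }
  if (shards_needing_push > 0) {
    return RecoveryAction::Push;
  }
  // Neither a pull nor a push is required. This is a valid outcome,
  // so no pull state may be assumed to exist here.
  return RecoveryAction::Nothing;
}
```

The point of the third branch is that "no push scheduled" does not imply "a pull is in flight", which is the assumption the failing assertion encoded.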
David Galloway [Mon, 6 Dec 2021 15:32:56 +0000 (10:32 -0500)]
doc: Use older mistune
https://github.com/miyakogi/m2r/issues/66
Signed-off-by: David Galloway <dgallowa@redhat.com>
benhanokh [Sun, 5 Dec 2021 07:47:49 +0000 (09:47 +0200)]
Merge pull request #43870 from benhanokh/restore_alloc_file
NCB::refresh allocation-file after FSCK remove
Gabriel BenHanokh [Mon, 8 Nov 2021 17:12:40 +0000 (19:12 +0200)]
BlueStore: Fix a bug when FSCK is invoked in mount()/umount()/mkfs() with DEEP option
Fixes: https://tracker.ceph.com/issues/53185
NCB mishandles fsck DEEP in the mount()/umount()/mkfs() case, causing it to remove the allocation-file without destaging a new copy (which costs us a full rebuild on startup).
There are also a few conflicting calls to open_db()/close_db() passing an inconsistent read-only flag.
We fix both issues by storing the open-db type (read-only/read-write) and using it for close-db (which no longer takes a read-only flag).
We also move the allocation-file destage to close-db so that the file is refreshed after being removed by fsck and similar operations.
Signed-off-by: Gabriel Benhanokh <gbenhano@redhat.com>
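The idea of remembering how the DB was opened and letting close-db act on that can be sketched as follows. `MiniStore` and all of its members are hypothetical and only model the behaviour described in the commit message; in particular, destaging only after a read-write open is an assumption of this sketch, not something the message states.

```cpp
#include <cassert>

// Hypothetical stand-in for the store; not BlueStore's real interface.
class MiniStore {
 public:
  enum class Mode { Closed, ReadOnly, ReadWrite };

  // The open mode is recorded once, at open time.
  void open_db(bool read_only) {
    assert(mode_ == Mode::Closed);
    mode_ = read_only ? Mode::ReadOnly : Mode::ReadWrite;
  }

  // fsck may delete the allocation file while the DB is open.
  void fsck_remove_allocation_file() { allocation_file_present_ = false; }

  // close_db() takes no read-only flag: it relies on the stored mode.
  // Assumption: a fresh allocation file is written only after a
  // read-write open.
  void close_db() {
    if (mode_ == Mode::ReadWrite) {
      allocation_file_present_ = true;  // destage a fresh copy
    }
    mode_ = Mode::Closed;
  }

  bool allocation_file_present() const { return allocation_file_present_; }

 private:
  Mode mode_ = Mode::Closed;
  bool allocation_file_present_ = true;
};
```

Because the caller no longer passes a flag to `close_db()`, an open/close pair cannot disagree about the mode.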
Sage Weil [Sat, 4 Dec 2021 17:52:45 +0000 (12:52 -0500)]
Merge PR #44155 into master
* refs/pull/44155/head:
mgr: limit changes to pg_num
Reviewed-by: Neha Ojha <nojha@redhat.com>
Sage Weil [Sat, 4 Dec 2021 17:52:09 +0000 (12:52 -0500)]
Merge PR #44108 into master
* refs/pull/44108/head:
mgr: fix locking for MetadataUpdate::finish
Reviewed-by: xie xingguo <xie.xingguo@zte.com.cn>
Reviewed-by: Josh Durgin <jdurgin@redhat.com>
Mykola Golub [Sat, 4 Dec 2021 16:25:02 +0000 (18:25 +0200)]
Merge pull request #44204 from songtongshuai/sts_ceph
test/librbd: add get_group test
Reviewed-by: Mykola Golub <mgolub@suse.com>
zdover23 [Sat, 4 Dec 2021 05:53:42 +0000 (15:53 +1000)]
Merge pull request #44189 from zdover23/wip-doc-2021-12-02-documenting-ceph
doc/start: update documenting-ceph.rst (1 of x)
Reviewed-by: Laura Flores <lflores@redhat.com>
Sage Weil [Fri, 3 Dec 2021 17:42:57 +0000 (12:42 -0500)]
Merge PR #44017 into master
* refs/pull/44017/head:
mgr/cephadm: Do not propogate access logs from cherrypy
Reviewed-by: Sebastian Wagner <sewagner@redhat.com>
Matt Benjamin [Fri, 3 Dec 2021 17:22:20 +0000 (12:22 -0500)]
Merge pull request #44139 from linuxbox2/wip-rgw-lcselect
rgwlc: permit lifecycle processing for a single bucket
Sage Weil [Fri, 3 Dec 2021 17:05:13 +0000 (12:05 -0500)]
Merge PR #44132 into master
* refs/pull/44132/head:
mgr/prometheus: define module options for standby
Reviewed-by: Laura Flores <lflores@redhat.com>
songtongshuai_yewu [Fri, 3 Dec 2021 14:29:03 +0000 (09:29 -0500)]
test/librbd: add get_group test
Signed-off-by: songtongshuai_yewu <songtongshuai_yewu@cmss.chinamobile.com>
Zac Dover [Thu, 2 Dec 2021 14:15:44 +0000 (00:15 +1000)]
doc/start: update documenting-ceph.rst (1 of x)
This PR updates the content on documenting-ceph,
which is, as of December 2021, in need of an
update.
This is the first of what I estimate to be three
to five PRs against this .rst file.
Signed-off-by: Zac Dover <zac.dover@gmail.com>
Deepika Upadhyay [Fri, 3 Dec 2021 09:35:12 +0000 (15:05 +0530)]
Merge pull request #44103 from majianpeng/librbd-pwl-flush-by-blockgurad
librbd/cache/pwl: Using BlockGuard control overlap ops order when flu…
Reviewed-by: Mykola Golub mgolub@suse.com
Reviewed-by: Sunny Kumar <sunkumar@redhat.com>
Reviewed-by: Deepika Upadhyay <dupadhya@redhat.com>
Deepika Upadhyay [Fri, 3 Dec 2021 06:21:08 +0000 (11:51 +0530)]
Merge pull request #44144 from majianpeng/librbd-fix-discard-granularity
librbd: fix discard granularity for pwl cache
Reviewed-by: mgolub@suse.com
Reviewed-by: Sunny Kumar <sunkumar@redhat.com>
Reviewed-by: Deepika Upadhyay <dupadhya@redhat.com>
zdover23 [Thu, 2 Dec 2021 23:37:51 +0000 (09:37 +1000)]
Merge pull request #43848 from dvanders/doc_bench
doc: add disk benchmarking and cache tuning recommendations
Reviewed-by: Zac Dover <zac.dover@gmail.com>
Yuri Weinstein [Thu, 2 Dec 2021 23:28:17 +0000 (15:28 -0800)]
Merge pull request #43691 from curtbruns/use_optimal_for_min_alloc
os/bluestore: Set min_alloc_size to optimal io size
Reviewed-by: Sage Weil <sage@redhat.com>
Reviewed-by: Igor Fedotov <ifedotov@suse.com>
Reviewed-by: Adam Kupczyk <akupczyk@redhat.com>
Yuri Weinstein [Thu, 2 Dec 2021 23:27:18 +0000 (15:27 -0800)]
Merge pull request #43578 from ronen-fr/wip-rf-log-options
common: hide internal logger configuration strings from clients
Reviewed-by: Laura Flores <lflores@redhat.com>
Yuri Weinstein [Thu, 2 Dec 2021 23:26:24 +0000 (15:26 -0800)]
Merge pull request #43493 from myoungwon/wip-52872
test: increase retry duration when calculating manifest ref. count
Reviewed-by: Neha Ojha <nojha@redhat.com>
Reviewed-by: Samuel Just <sjust@redhat.com>
Yuri Weinstein [Thu, 2 Dec 2021 23:25:10 +0000 (15:25 -0800)]
Merge pull request #43944 from aclamk/wip-aclamk-fix-crush-location-hook-50659
crush: Fix segfault in update_from_hook
Reviewed-by: Josh Durgin <jdurgin@redhat.com>
Reviewed-by: Adam Emerson <aemerson@redhat.com>
Yuri Weinstein [Thu, 2 Dec 2021 20:23:28 +0000 (12:23 -0800)]
Merge pull request #44194 from ceph/wip-yuriw-crontab-master
qa/tests: switch all gibba machines to smithi
Reviewed-by: Neha Ojha <nojha@redhat.com>
Reviewed-by: Sage Weil <sage@redhat.com>
Daniel Gryniewicz [Thu, 2 Dec 2021 19:59:21 +0000 (14:59 -0500)]
Merge pull request #44178 from Huber-ming/admin_lc_fix
radosgw-admin: fix error message of OPT::LC_RESHARD_FIX
Yuri Weinstein [Thu, 2 Dec 2021 19:10:47 +0000 (11:10 -0800)]
qa/tests: switch all gibba machines to smithi
Signed-off-by: Yuri Weinstein <yweinste@redhat.com>
Neha Ojha [Thu, 2 Dec 2021 16:39:41 +0000 (08:39 -0800)]
Merge pull request #43336 from ifed01/wip-fix-bluefs-volumes-ops
qa/osd-bluefs-volume-ops: fix bluefs volumes ops test case
Reviewed-by: Adam Kupczyk <akupczyk@redhat.com>
Adam King [Thu, 18 Nov 2021 23:54:33 +0000 (18:54 -0500)]
mgr/cephadm: Do not propogate access logs from cherrypy
The only messages we're really interested in are errors, which
come from the error logs. The access logs only provide
messages like
- [18/Nov/2021:23:55:32] "POST /data HTTP/1.1" 200 - "" "Python-urllib/3.8"
which we don't want spammed to the log, especially since they are
logged at INFO level.
Signed-off-by: Adam King <adking@redhat.com>
Sage Weil [Thu, 2 Dec 2021 14:57:40 +0000 (09:57 -0500)]
Merge PR #44035 into master
* refs/pull/44035/head:
mgr/cephadm: less log noise when config checks fail
Reviewed-by: Sebastian Wagner <sewagner@redhat.com>
Alfonso Martínez [Thu, 2 Dec 2021 14:05:23 +0000 (15:05 +0100)]
mgr/dashboard: fix frontend deps' vulnerabilities
- Remove npm-force-resolutions: no resolution is needed anymore, and it modifies package-lock.json every time it is run (stripping the last empty line).
- Add .npmrc: save exact versions by default; do not launch an audit report when installing.
Fixes: https://tracker.ceph.com/issues/48005
Signed-off-by: Alfonso Martínez <almartin@redhat.com>
benhanokh [Thu, 2 Dec 2021 13:49:50 +0000 (15:49 +0200)]
Merge pull request #44089 from benhanokh/ncb_fsck_fix
os/bluestore: bug-fix for NCB-FSCK
Radoslaw Zarzynski [Wed, 1 Dec 2021 18:35:36 +0000 (18:35 +0000)]
crimson/osd: fix assertion failure in InternalClientRequest.
```
DEBUG 2021-12-01 07:55:10,541 [shard 0] osd - internal_client_request(id=1, detail=): in repeat
TRACE 2021-12-01 07:55:10,541 [shard 0] osd - with_interruption_cond: interrupt_cond: 0x0
TRACE 2021-12-01 07:55:10,541 [shard 0] osd - call_with_interruption_impl: may_interrupt: false, local interrupt_condintion: 0x603000b5d270, global interrupt_cond: 0x0,N7crimson3osd20IOInterruptConditionE
TRACE 2021-12-01 07:55:10,541 [shard 0] osd - set: interrupt_cond: 0x603000b5d270, ref_count: 1
TRACE 2021-12-01 07:55:10,541 [shard 0] osd - call_with_interruption_impl clearing interrupt_cond: 0x603000b5d270,N7crimson3osd20IOInterruptConditionE
TRACE 2021-12-01 07:55:10,541 [shard 0] osd - call_with_interruption_impl: may_interrupt: false, local interrupt_condintion: 0x603000b5d270, global interrupt_cond: 0x0,N7crimson3osd20IOInterruptConditionE
TRACE 2021-12-01 07:55:10,541 [shard 0] osd - set: interrupt_cond: 0x603000b5d270, ref_count: 1
TRACE 2021-12-01 07:55:10,541 [shard 0] osd - call_with_interruption_impl clearing interrupt_cond: 0x603000b5d270,N7crimson3osd20IOInterruptConditionE
TRACE 2021-12-01 07:55:10,541 [shard 0] osd - call_with_interruption_impl: may_interrupt: false, local interrupt_condintion: 0x603000b5d270, global interrupt_cond: 0x0,N7crimson3osd20IOInterruptConditionE
TRACE 2021-12-01 07:55:10,541 [shard 0] osd - set: interrupt_cond: 0x603000b5d270, ref_count: 1
TRACE 2021-12-01 07:55:10,541 [shard 0] osd - call_with_interruption_impl clearing interrupt_cond: 0x603000b5d270,N7crimson3osd20IOInterruptConditionE
TRACE 2021-12-01 07:55:10,542 [shard 0] osd - call_with_interruption_impl: may_interrupt: false, local interrupt_condintion: 0x603000b5d270, global interrupt_cond: 0x0,N7crimson3osd20IOInterruptConditionE
TRACE 2021-12-01 07:55:10,542 [shard 0] osd - set: interrupt_cond: 0x603000b5d270, ref_count: 1
DEBUG 2021-12-01 07:55:10,542 [shard 0] osd - do_recover_missing check for recovery, MIN
ERROR 2021-12-01 07:55:10,542 [shard 0] none - /home/jenkins-build/build/workspace/ceph-dev-new-build/ARCH/x86_64/AVAILABLE_ARCH/x86_64/AVAILABLE_DIST/centos8/DIST/centos8/MACHINE_SIZE/gigantic/release/17.0.0-8902-g52fd47fe/rpm/el8/BUILD/ceph-17.0.0-8902-g52fd47fe/src/crimson/osd/pg.cc:1195 : In function 'bool crimson::osd::PG::is_degraded_or_backfilling_object(const hobject_t&) const', ceph_assert(%s)
!get_acting_recovery_backfill().empty()
```
Signed-off-by: Radoslaw Zarzynski <rzarzyns@redhat.com>
Dan van der Ster [Mon, 8 Nov 2021 20:47:12 +0000 (21:47 +0100)]
doc: add disk benchmarking and cache recommendations
Advise operators on how to benchmark devices for BlueStore, and how to
tune the volatile write cache for optimal OSD performance.
Fixes: https://tracker.ceph.com/issues/53161
Signed-off-by: Dan van der Ster <daniel.vanderster@cern.ch>
Venky Shankar [Thu, 2 Dec 2021 05:14:55 +0000 (10:44 +0530)]
Merge pull request #44063 from vshankar/tr-52487
qa: wait for purge queue operations to finish
Reviewed-by: Venky Shankar <vshankar@redhat.com>
Huber-ming [Thu, 2 Dec 2021 01:18:19 +0000 (09:18 +0800)]
radosgw-admin: fix error message of OPT::LC_RESHARD_FIX
Signed-off-by: Huber-ming <zhangsm01@inspur.com>
Sage Weil [Wed, 1 Dec 2021 22:18:06 +0000 (17:18 -0500)]
Merge PR #44125 into master
* refs/pull/44125/head:
qa/suites/rados/thrash-old-clients: use better-support cephadm distro/podman
Reviewed-by: Sebastian Wagner <sewagner@redhat.com>
Sage Weil [Wed, 24 Nov 2021 18:22:26 +0000 (13:22 -0500)]
mgr: fix locking for MetadataUpdate::finish
We need to hold the DaemonState lock here since we are both reading and
writing its content.
Fixes: https://tracker.ceph.com/issues/53393
Signed-off-by: Sage Weil <sage@newdream.net>
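The rule stated in the commit above, holding the state's lock across both the read and the write, can be sketched with standard C++ primitives. `MiniDaemonState` and `apply_metadata_update()` are hypothetical names for this illustration; the real `DaemonState` and `MetadataUpdate::finish` differ.

```cpp
#include <cassert>
#include <cstddef>
#include <map>
#include <mutex>
#include <string>

// Hypothetical stand-in for per-daemon state guarded by its own lock.
struct MiniDaemonState {
  std::mutex lock;
  std::map<std::string, std::string> metadata;
};

// Merges `update` into the state and returns the resulting entry count.
// The lock is held across both the write and the final read.
inline std::size_t apply_metadata_update(
    MiniDaemonState& state,
    const std::map<std::string, std::string>& update) {
  std::lock_guard<std::mutex> guard(state.lock);
  for (const auto& kv : update) {
    state.metadata[kv.first] = kv.second;
  }
  return state.metadata.size();
}
```

Taking the lock once for the whole function avoids a window in which another thread could observe or modify a half-updated map.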
Sage Weil [Sat, 20 Nov 2021 14:53:36 +0000 (09:53 -0500)]
mgr/cephadm: less log noise when config checks fail
We are already raising health alerts--there is no need to spam the log
every few seconds when these checks are evaluated.
Signed-off-by: Sage Weil <sage@newdream.net>
Matt Benjamin [Tue, 30 Nov 2021 17:42:33 +0000 (12:42 -0500)]
rgwlc: optimize single-bucket lifecycle processing
Looks up the shard index of the corresponding bucket, and only
buckets in the corresponding shard are considered for processing.
This has a side effect of matching buckets by id, and also adds
support for --tenant.
Signed-off-by: Matt Benjamin <mbenjamin@redhat.com>
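The shard-based filtering described in the commit above can be sketched as follows. The hash-modulo mapping and the names `lc_shard_index()` and `buckets_in_target_shard()` are assumptions made for this illustration; the real RGW lifecycle code uses its own key format and shard calculation.

```cpp
#include <cassert>
#include <cstddef>
#include <functional>
#include <string>
#include <vector>

// Hypothetical mapping from a bucket key to a lifecycle shard.
inline std::size_t lc_shard_index(const std::string& bucket_key,
                                  std::size_t num_shards) {
  assert(num_shards > 0);
  return std::hash<std::string>{}(bucket_key) % num_shards;
}

// Returns only the candidates that live in the same shard as `target`,
// so the other shards are never scanned.
inline std::vector<std::string> buckets_in_target_shard(
    const std::vector<std::string>& candidates,
    const std::string& target,
    std::size_t num_shards) {
  const std::size_t shard = lc_shard_index(target, num_shards);
  std::vector<std::string> selected;
  for (const auto& key : candidates) {
    if (lc_shard_index(key, num_shards) == shard) {
      selected.push_back(key);
    }
  }
  return selected;
}
```

With a single bucket as the target, the work is bounded by the size of one shard rather than by the total number of buckets.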
Neha Ojha [Wed, 1 Dec 2021 18:51:14 +0000 (10:51 -0800)]
Merge pull request #44161 from neha-ojha/wip-fix-upgrades
qa/suites/upgrade/octopus-x: bunch of fixes and cleanup
Reviewed-by: Yuri Weinstein <yweinste@redhat.com>
Reviewed-by: Sridhar Seshasayee <sseshasa@redhat.com>
Sebastian Wagner [Wed, 1 Dec 2021 17:04:44 +0000 (18:04 +0100)]
Merge pull request #43149 from sebastian-philipp/cephadm-force-last-admin
mgr/cephadm: Add client.admin keyring when upgrading from older version
Reviewed-by: Adam King <adking@redhat.com>
Reviewed-by: Michael Fritch <mfritch@suse.com>
Reviewed-by: Sage Weil <sage@newdream.net>
Sebastian Wagner [Wed, 1 Dec 2021 16:57:28 +0000 (17:57 +0100)]
Merge pull request #44109 from sebastian-philipp/doc-crush-types
doc/cephadm: host location: add link to types
Reviewed-by: Michael Fritch <mfritch@suse.com>
Sebastian Wagner [Wed, 1 Dec 2021 14:47:36 +0000 (15:47 +0100)]
Merge pull request #44134 from liewegas/cephadm-device-enhanced-scan
mgr/cephadm: avoid repeated calls to get_module_option
Reviewed-by: Adam King <adking@redhat.com>
Reviewed-by: Sebastian Wagner <sewagner@redhat.com>
Kefu Chai [Wed, 1 Dec 2021 14:29:33 +0000 (22:29 +0800)]
Merge pull request #44071 from tchaikov/wip-atomic-mips64
cmake: test for 16-byte atomic support on mips also
Reviewed-by: Casey Bodley <cbodley@redhat.com>
Kefu Chai [Wed, 1 Dec 2021 14:15:17 +0000 (22:15 +0800)]
Merge pull request #43540 from fengchunsong/dpdk-test
test/msgr: remove DPDK Non-runtime configure items
Reviewed-by: Kefu Chai <tchaikov@gmail.com>
Matan Breizman [Wed, 24 Nov 2021 15:23:47 +0000 (15:23 +0000)]
doc/dev: Running workunits locally
Signed-off-by: Matan Breizman <mbreizma@redhat.com>
Venky Shankar [Wed, 1 Dec 2021 10:54:22 +0000 (16:24 +0530)]
Merge pull request #43981 from lxbsz/wip-53216
qa/cephfs: correct the parameters' order
Reviewed-by: Venky Shankar <vshankar@redhat.com>
Omri Zeneva [Wed, 1 Dec 2021 10:17:08 +0000 (12:17 +0200)]
common/tracer: remove unnecessary opentelemetry includes
Signed-off-by: Omri Zeneva <ozeneva@redhat.com>
Omri Zeneva [Wed, 1 Dec 2021 10:04:52 +0000 (12:04 +0200)]
rgw: get attrs in AbortMultipart only if tracing is enabled
Signed-off-by: Omri Zeneva <ozeneva@redhat.com>
Sebastian Wagner [Wed, 1 Dec 2021 09:33:01 +0000 (10:33 +0100)]
Merge pull request #44087 from guits/guits-fix-cv-rootfs
ceph-volume: remove --root param from nsenter cmd
Reviewed-by: Sébastien Han <seb@redhat.com>
Sebastian Wagner [Wed, 1 Dec 2021 09:24:00 +0000 (10:24 +0100)]
Merge pull request #44143 from devlikai/master
doc/mgr/diskprediction: fix a typo.
Reviewed-by: Sebastian Wagner <sewagner@redhat.com>
Venky Shankar [Wed, 1 Dec 2021 09:01:18 +0000 (14:31 +0530)]
Merge pull request #44116 from lxbsz/caps_doc1
doc: fix the style of the cephfs capability doc
Reviewed-by: Venky Shankar <vshankar@redhat.com>
Jianpeng Ma [Wed, 1 Dec 2021 08:58:25 +0000 (16:58 +0800)]
librbd/cache/pwl: it should in apply_metadata set discard_granularity for pwl cache.
Function apply_meta can overwrite discard_granularity_bytes
based on option.
Fixes: https://tracker.ceph.com/issues/53434
Signed-off-by: Jianpeng Ma <jianpeng.ma@intel.com>
Sebastian Wagner [Wed, 1 Dec 2021 08:48:42 +0000 (09:48 +0100)]
Merge pull request #44129 from spdfnet/patch-1
doc: fix typo in cephadm host management
Reviewed-by: Sebastian Wagner <sewagner@redhat.com>
Xiubo Li [Thu, 25 Nov 2021 08:27:05 +0000 (16:27 +0800)]
qa: correct the parameters' order
The parameters are in the wrong order and the client_config is missing.
Introduced-by: 242585656c6bc282f5adbd073e83bafa86b5c0d2
Fixes: https://tracker.ceph.com/issues/53216
Signed-off-by: Xiubo Li <xiubli@redhat.com>
Xiubo Li [Thu, 25 Nov 2021 08:29:21 +0000 (16:29 +0800)]
qa: move the optional client_config parameter to the end
Fixes: https://tracker.ceph.com/issues/53216
Signed-off-by: Xiubo Li <xiubli@redhat.com>
Xiubo Li [Thu, 25 Nov 2021 08:30:12 +0000 (16:30 +0800)]
qa: rename and save the client_config for kernel mount
Fixes: https://tracker.ceph.com/issues/53216
Signed-off-by: Xiubo Li <xiubli@redhat.com>
Venky Shankar [Tue, 23 Nov 2021 09:37:01 +0000 (04:37 -0500)]
qa: wait for purge queue operations to finish
TestFragmentation.test_deep_split relies on `num_strays`
reaching zero, expecting that the purge threads would
have deleted the directory entries. However, checking
`num_strays` cannot be relied on, since PurgeQueue merely
journals the purge item (see PurgeQueue::push), after which
the StrayManager marks the stray as removed and thereby
updates `num_strays`.
So, add an additional condition to check if the purge
threads have finished processing items.
Fixes: http://tracker.ceph.com/issues/52487
Signed-off-by: Venky Shankar <vshankar@redhat.com>
Venky Shankar [Wed, 1 Dec 2021 04:08:46 +0000 (09:38 +0530)]
Merge pull request #44038 from lxbsz/wip-53082
client: fix crash when iterating and deleting sessions
Reviewed-by: Venky Shankar <vshankar@redhat.com>
Venky Shankar [Wed, 1 Dec 2021 04:08:05 +0000 (09:38 +0530)]
Merge pull request #43878 from jtlayton/wip-53214
qa: account for split of the kclient "metrics" debugfs file
Reviewed-by: Venky Shankar <vshankar@redhat.com>
Venky Shankar [Wed, 1 Dec 2021 04:07:00 +0000 (09:37 +0530)]
Merge pull request #43850 from batrick/i53194
mds: defer messages to bootstrapping ranks
Reviewed-by: Venky Shankar <vshankar@redhat.com>
Venky Shankar [Wed, 1 Dec 2021 04:04:54 +0000 (09:34 +0530)]
Merge pull request #43297 from yongseokoh/test-dir-max-entries
qa: add mds_dir_max_entries workunit test case
Reviewed-by: Venky Shankar <vshankar@redhat.com>
Venky Shankar [Wed, 1 Dec 2021 04:03:29 +0000 (09:33 +0530)]
Merge pull request #41334 from vshankar/wip-kcephfs-new-mount-syntax
mount: introduce new device mount syntax
Reviewed-by: Venky Shankar <vshankar@redhat.com>
Neha Ojha [Wed, 1 Dec 2021 02:46:21 +0000 (02:46 +0000)]
qa/*/octopus-x/stress-split-erasure-code-no-cephadm: set quincy flags
Signed-off-by: Neha Ojha <nojha@redhat.com>
Neha Ojha [Wed, 1 Dec 2021 01:22:46 +0000 (01:22 +0000)]
qa/suites/upgrade/octopus-x/stress-split-no-cephadm: remove msgr2
Signed-off-by: Neha Ojha <nojha@redhat.com>
Neha Ojha [Wed, 1 Dec 2021 01:15:14 +0000 (01:15 +0000)]
qa: test upgrades with hybrid allocator
Signed-off-by: Neha Ojha <nojha@redhat.com>
Neha Ojha [Wed, 1 Dec 2021 01:12:15 +0000 (01:12 +0000)]
qa: rename octopus install correctly
Signed-off-by: Neha Ojha <nojha@redhat.com>
Neha Ojha [Wed, 1 Dec 2021 00:39:57 +0000 (00:39 +0000)]
qa: remove leftovers from nautilus
pglog_hardlimit and msgr2
Signed-off-by: Neha Ojha <nojha@redhat.com>
Neha Ojha [Wed, 1 Dec 2021 00:31:48 +0000 (00:31 +0000)]
qa/suites/upgrade/octopus-x/stress-split-no-cephadm: set quincy flags
not pacific
Signed-off-by: Neha Ojha <nojha@redhat.com>
Sage Weil [Tue, 30 Nov 2021 23:18:31 +0000 (18:18 -0500)]
mgr: limit changes to pg_num
We need to avoid making drastic changes to pg_num that outpace pgp_num, or
else we may hit the per-osd pg limits.
Fixes: https://tracker.ceph.com/issues/53442
Signed-off-by: Sage Weil <sage@newdream.net>
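One way to picture the limit described in the commit above is a step function that never lets pg_num run more than a fixed distance ahead of pgp_num. `next_pg_num()` and its `max_gap` parameter are hypothetical and chosen only for this sketch; the actual mgr logic and its thresholds are not shown in the commit message.

```cpp
#include <algorithm>
#include <cassert>
#include <cstdint>

// Returns the next pg_num to apply when moving toward `target`,
// never letting pg_num run more than `max_gap` ahead of pgp_num.
// `max_gap` is a hypothetical knob for this sketch; overflow of
// pgp_num + max_gap is not handled.
inline std::uint32_t next_pg_num(std::uint32_t pg_num, std::uint32_t pgp_num,
                                 std::uint32_t target, std::uint32_t max_gap) {
  if (target <= pg_num) {
    return target;  // decreases are not throttled in this sketch
  }
  const std::uint32_t ceiling = pgp_num + max_gap;
  // Never move backwards while increasing, and never exceed the ceiling.
  return std::max(pg_num, std::min(target, ceiling));
}
```

For example, with `pg_num = pgp_num = 32`, a target of 256 and `max_gap = 32`, the first step yields 64; further steps stall at 64 until pgp_num catches up.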
Yehuda Sadeh [Tue, 30 Nov 2021 22:32:39 +0000 (14:32 -0800)]
Merge pull request #42710 from yehudasa/wip-rgw-mgr-module
mgr/rgw: new rgw manager module
Reviewed-by: Casey Bodley <cbodley@redhat.com>
Reviewed-by: Sebastian Wagner <sewagner@redhat.com>
Sage Weil [Tue, 30 Nov 2021 20:19:55 +0000 (15:19 -0500)]
Merge PR #44140 into master
* refs/pull/44140/head:
python-common/ceph/deployment/drive_group: fix 'orch ls --format yaml'
Reviewed-by: Sebastian Wagner <sewagner@redhat.com>
Yehuda Sadeh [Tue, 30 Nov 2021 16:45:06 +0000 (08:45 -0800)]
python-common/rgw: fix style issues
Signed-off-by: Yehuda Sadeh <yehuda@redhat.com>
Yehuda Sadeh [Mon, 29 Nov 2021 23:31:42 +0000 (15:31 -0800)]
doc/rgw: fix docs build
Work around the rgw module name conflict: there are two separate modules named
rgw, src/pybind/rgw and src/pybind/mgr/rgw.
Signed-off-by: Yehuda Sadeh <yehuda@redhat.com>
Ronen Friedman [Tue, 30 Nov 2021 17:43:13 +0000 (19:43 +0200)]
Merge pull request #44072 from ronen-fr/wip-rf-latescrub-count
qa/standalone: osd-scrub-repair.sh: fix expected "not scrubbed since"…
Reviewed-by: Neha Ojha <nojha@redhat.com>
Reviewed-by: Sridhar Seshasayee <sseshasa@redhat.com>
Sebastian Wagner [Tue, 30 Nov 2021 16:50:15 +0000 (17:50 +0100)]
Merge pull request #44118 from sebastian-philipp/cephadm-inventory-changed-while-iterated
mgr/cephadm: Inventory: Fix `dictionary changed size during iteration `
Reviewed-by: Michael Fritch <mfritch@suse.com>
Sage Weil [Sun, 28 Nov 2021 21:10:40 +0000 (15:10 -0600)]
qa/suites/rados/thrash-old-clients: use better-support cephadm distro/podman
Signed-off-by: Sage Weil <sage@newdream.net>
Ernesto Puerta [Tue, 30 Nov 2021 13:54:20 +0000 (14:54 +0100)]
Merge pull request #44115 from rhcs-dashboard/fix-tooltip-fast-diff
mgr/dashboard: avoid tooltip if disk_usage=null and fast-diff enabled
Reviewed-by: Avan Thakkar <athakkar@redhat.com>
Reviewed-by: Nizamudeen A <nia@redhat.com>
Reviewed-by: Pere Diaz Bou <pdiazbou@redhat.com>
Ernesto Puerta [Tue, 30 Nov 2021 13:53:59 +0000 (14:53 +0100)]
Merge pull request #44083 from wangbo-yw/wangbo-yw
mgr/dashboard: add some test for controllers/pool.py
Reviewed-by: Avan Thakkar <athakkar@redhat.com>
Reviewed-by: Nizamudeen A <nia@redhat.com>
Reviewed-by: Pere Diaz Bou <pdiazbou@redhat.com>
Ernesto Puerta [Tue, 30 Nov 2021 13:52:44 +0000 (14:52 +0100)]
Merge pull request #43855 from zhangmengqianyw/zmq-unittest
mgr/dashboard:add unittest in test_osd.py
Reviewed-by: Alfonso Martínez <almartin@redhat.com>
Reviewed-by: Avan Thakkar <athakkar@redhat.com>
Reviewed-by: Nizamudeen A <nia@redhat.com>
Sebastian Wagner [Tue, 30 Nov 2021 09:48:48 +0000 (10:48 +0100)]
Merge pull request #44082 from pcuzner/fix-prometheus-timings
mgr/prometheus: Fix the per method stats exported
Reviewed-by: Kefu Chai <kchai@redhat.com>
Omri Zeneva [Tue, 30 Nov 2021 09:05:50 +0000 (11:05 +0200)]
doc: update jaegertracing user doc
Signed-off-by: Omri Zeneva <ozeneva@redhat.com>
Omri Zeneva [Tue, 30 Nov 2021 09:05:34 +0000 (11:05 +0200)]
rgw: implement single trace for multipart upload ops
Signed-off-by: Omri Zeneva <ozeneva@redhat.com>
Omri Zeneva [Tue, 30 Nov 2021 09:05:02 +0000 (11:05 +0200)]
common/tracer: implement encode & decode functions
Signed-off-by: Omri Zeneva <ozeneva@redhat.com>
Kalpesh Pandya [Mon, 15 Nov 2021 07:34:17 +0000 (13:04 +0530)]
src/rgw: Empty configuration support
This PR solves: https://tracker.ceph.com/issues/53040
If we pass an empty configuration, it should ideally
delete all the notifications.
Signed-off-by: Kalpesh Pandya <kapandya@redhat.com>
Kyle [Tue, 30 Nov 2021 07:27:26 +0000 (15:27 +0800)]
doc/mgr/diskprediction: fix a typo.
doc: remove extra comma.
This commit removes the extra comma in "To disable prediction,:".
Fixes: https://tracker.ceph.com/issues/53433
Signed-off-by: devlikai <likai_lc@inspur.com>
Yuval Lifshitz [Tue, 30 Nov 2021 06:57:23 +0000 (08:57 +0200)]
Merge pull request #43529 from curtbruns/rgw-lua-storageclass
rgw/lua: allow read/write of StorageClass field
Yuval Lifshitz [Tue, 30 Nov 2021 06:56:26 +0000 (08:56 +0200)]
Merge pull request #42504 from arjune123/rgw-bug-fixes
rgw/notification: assigning the value of zonegroup to awsRegion