git.apps.os.sepia.ceph.com Git - ceph-ci.git/log
Radoslaw Zarzynski [Thu, 16 Dec 2021 10:15:26 +0000 (10:15 +0000)]
crimson/osd: implement op discarding for pglog-based recovery.
In contrast to the classical OSD, crimson doesn't discard
`MOSDPGPush` or `MOSDPGPull` messages that were sent in
an epoch earlier than `last_peering_reset`.
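The intended check mirrors the classical OSD's behaviour; a minimal Python sketch of the idea (illustrative names only, not the actual crimson API):

```python
# Sketch of epoch-based replica-op discarding (illustrative, not the
# actual crimson/osd code): a recovery message carries the epoch in
# which it was sent; if peering has been reset since then, the op refers
# to an obsolete view of the PG and must be dropped, not applied.

class PG:
    def __init__(self):
        self.last_peering_reset = 0

    def start_peering_interval(self, new_epoch):
        # Every new interval bumps last_peering_reset; anything sent
        # before this point is stale.
        self.last_peering_reset = new_epoch

    def can_discard_replica_op(self, map_epoch):
        # Discard MOSDPGPush/MOSDPGPull sent in an epoch earlier than
        # the last peering reset.
        return map_epoch < self.last_peering_reset


pg = PG()
pg.start_peering_interval(33)
assert pg.can_discard_replica_op(32)      # push from epoch 32: stale, drop
assert not pg.can_discard_replica_op(33)  # current interval: handle
```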
This was the problem behind the following crash observed
at Sepia:
```
rzarzynski@teuthology:/home/teuthworker/archive/rzarzynski-2021-12-07_08:51:40-rados-master-distro-basic-smithi$ less ./6550163/remote/smithi190/log/ceph-osd.1.log.gz
...
DEBUG 2021-12-07 09:23:18,543 [shard 0] osd - handle_push: MOSDPGPush(2.1 32/29 {PushOp(2:8ae28953:::benchmark_data_smithi190_40039_object884:head, version: 19'102, data_included: [0~1], data_size: 1, omap_header_size: 0, omap_entries_size: 0, attrset_size: 1, recovery_info: ObjectRecoveryInfo(2:8ae28953:::benchmark_data_smithi190_40039_object884:head@19'102, size: 1, copy_subset: [0~1], clone_subset: {}, snapset: 0={}:{}, object_exist: 0), after_progress: ObjectRecoveryProgress(!first, data_recovered_to:1, data_complete:true, omap_recovered_to:, omap_complete:true, error:false), before_progress: ObjectRecoveryProgress(first, data_recovered_to:0, data_complete:false, omap_recovered_to:, omap_complete:false, error:false))}) v4
DEBUG 2021-12-07 09:23:18,543 [shard 0] osd - _handle_push
DEBUG 2021-12-07 09:23:18,544 [shard 0] osd - submit_push_data
...
DEBUG 2021-12-07 09:23:18,545 [shard 0] osd - pg_epoch 33 pg[2.1( v 19'236 lc 19'101 (0'0,19'236] local-lis/les=0/0 n=0 ec=12/12 lis/c=29/18 les/c/f=30/19/0 sis=33) [1]/[2] r=-1 lpr=33 pi=[18,33)/2 crt=19'236 mlcod 0'0 remapped NOTIFY last_complete now 19'101 log.complete_to at end
ERROR 2021-12-07 09:23:18,545 [shard 0] none - /home/jenkins-build/build/workspace/ceph-dev-new-build/ARCH/x86_64/AVAILABLE_ARCH/x86_64/AVAILABLE_DIST/centos8/DIST/centos8/MACHINE_SIZE/gigantic/release/17.0.0-8904-g6dfda01c/rpm/el8/BUILD/ceph-17.0.0-8904-g6dfda01c/src/osd/PeeringState.cc:4198: In function 'void PeeringState::recover_got(const hobject_t&, eversion_t, bool, ObjectStore::Transaction&)', ceph_assert(%s)
info.last_complete == info.last_update
Aborting on shard 0.
Backtrace:
Reactor stalled for 1270 ms on shard 0. Backtrace: 0xb14ab 0x470b5f58 0x46e2303d 0x46e3eeed 0x46e3f2b2 0x46e3f476 0x46e3f726 0x12b1f 0xc8e3b 0x3ffd3682 0x3ffd8b7b 0x3ffda08e 0x3ffda753 0x3ffcfdcb 0x3ffd02e2 0x3ffd0ada 0x12b1f 0x3737e 0x21db4 0x3feb09be 0x3cd7fc6e 0x3bec409f 0x3c13d963 0x3c250005 0x3c250b21 0x3c251436 0x3c251fa7 0x3a2cc920 0x3a2fa631 0x3a2fb2cd 0x46df4da1 0x46e3d04a 0x46fc744b 0x46fc9420 0x46a79302 0x46a7db6b 0x3a18b7f2 0x23492 0x39d30edd
0# gsignal in /lib64/libc.so.6
1# abort in /lib64/libc.so.6
2# ceph::__ceph_assert_fail(char const*, char const*, int, char const*) in ceph-osd
3# PeeringState::recover_got(hobject_t const&, eversion_t, bool, ceph::os::Transaction&) in ceph-osd
4# PGRecovery::on_local_recover(hobject_t const&, ObjectRecoveryInfo const&, bool, ceph::os::Transaction&) in ceph-osd
```
The sequence of events
---------------------------
The merged log had entries up to `19'236`:
```
DEBUG 2021-12-07 09:22:06,727 [shard 0] osd - pg_advance_map(id=74, detail=PGAdvanceMap(pg=2.1 from=23 to=24 do_init)): complete
TRACE 2021-12-07 09:22:06,729 [shard 0] osd - call_with_interruption_impl: may_interrupt: false, local interrupt_condintion: 0x603000699800, global interrupt_cond: 0x0,N7crimson3osd20IOInterruptConditionE
TRACE 2021-12-07 09:22:06,729 [shard 0] osd - set: interrupt_cond: 0x603000699800, ref_count: 1
DEBUG 2021-12-07 09:22:06,729 [shard 0] osd - do_peering_event handling epoch_sent: 24 epoch_requested: 23 MLogRec from 2 log log((0'0,19'236], crt=19'236) pi ([18,22] all_participants=0,1,2,3 intervals=([18,21] acting 2,3)) pg_lease(ru 89.867439270s ub 89.867439270s int 16.000000000s) +create_info for pg: 2.1
DEBUG 2021-12-07 09:22:06,729 [shard 0] osd - pg_epoch 24 pg[2.1( empty local-lis/les=0/0 n=0 ec=12/12 lis/c=18/18 les/c/f=19/19/0 sis=23) [0,1]/[2] r=-1 lpr=23 pi=[18,23)/1 crt=0'0 mlcod 0'0 unknown state<Started/Stray>: got info+log from osd.2 2.1( v 19'236 (0'0,19'236] local-lis/les=23/24 n=0 ec=12/12 lis/c=18/18 les/c/f=19/19/0 sis=23) log((0'0,19'236], crt=19'236)
DEBUG 2021-12-07 09:22:06,729 [shard 0] osd - merge_log log((0'0,19'236], crt=19'236) from osd.2 into log((0'0,0'0], crt=0'0)
DEBUG 2021-12-07 09:22:06,729 [shard 0] osd - merge_log extending head to 19'236
DEBUG 2021-12-07 09:22:06,729 [shard 0] osd - ? 19'236 (0'0) modify 2:8a93bbd4:::foo.7:head by client.4668.0:1 2021-12-07T09:21:57.301709+0000 0 ObjectCleanRegions clean_offsets: [9033~18446744073709542582], clean_omap: 1, new_object: 0
...
DEBUG 2021-12-07 09:22:06,739 [shard 0] osd - ? 19'103 (0'0) modify 2:9cf5b466:::benchmark_data_smithi190_40039_object890:head by client.4423.0:891 2021-12-07T09:21:44.366732+0000 0 ObjectCleanRegions clean_offsets: [1~18446744073709551614], clean_omap: 1, new_object: 0
DEBUG 2021-12-07 09:22:06,739 [shard 0] osd - ? 19'102 (0'0) modify 2:8ae28953:::benchmark_data_smithi190_40039_object884:head by client.4423.0:885 2021-12-07T09:21:44.292066+0000 0 ObjectCleanRegions clean_offsets: [1~18446744073709551614], clean_omap: 1, new_object: 0
DEBUG 2021-12-07 09:22:06,739 [shard 0] osd - ? 19'101 (0'0) modify 2:80595656:::benchmark_data_smithi190_40039_object883:head by client.4423.0:884 2021-12-07T09:21:44.285634+0000 0 ObjectCleanRegions clean_offsets: [1~18446744073709551614], clean_omap: 1, new_object: 0
```
The PG log was at `complete_to 19'102` when recovering the previous object:
```
DEBUG 2021-12-07 09:23:18,394 [shard 0] osd - pg_epoch 32 pg[2.1( v 19'236 lc 19'100 (0'0,19'236] local-lis/les=0/0 n=0 ec=12/12 lis/c=29/18 les/c/f=30/19/0 sis=29) [0,1]/[2] r=-1 lpr=29 pi=[18,29)/2 luod=0'0 crt=19'236 mlcod 0'0 active+remapped got missing 2:80595656:::benchmark_data_smithi190_40039_object883:head v 19'101
DEBUG 2021-12-07 09:23:18,394 [shard 0] osd - pg_epoch 32 pg[2.1( v 19'236 lc 19'101 (0'0,19'236] local-lis/les=0/0 n=0 ec=12/12 lis/c=29/18 les/c/f=30/19/0 sis=29) [0,1]/[2] r=-1 lpr=29 pi=[18,29)/2 luod=0'0 crt=19'236 lcod 19'100 mlcod 0'0 active+remapped last_complete now 19'101 log.complete_to 19'102
```
```
DEBUG 2021-12-07 09:23:18,397 [shard 0] osd - pg_epoch 32 pg[2.1( v 19'236 lc 19'101 (0'0,19'236] local-lis/les=0/0 n=0 ec=12/12 lis/c=29/18 les/c/f=30/19/0 sis=29) [0,1]/[2] r=-1 lpr=29 pi=[18,29)/2 luod=0'0 crt=19'236 lcod 19'100 mlcod 0'0 active+remapped recovery_committed_to version 19'101 now ondisk
```
Then a `PGAdvanceMap` event happened...
```
DEBUG 2021-12-07 09:23:18,468 [shard 0] osd - pg_advance_map(id=149, detail=PGAdvanceMap(pg=2.1 from=32 to=33)): start
```
... and the PG 2.1 went to `Reset`:
```
DEBUG 2021-12-07 09:23:18,477 [shard 0] osd - pg_epoch 32 pg[2.1( v 19'236 lc 19'101 (0'0,19'236] local-lis/les=0/0 n=0 ec=12/12 lis/c=29/18 les/c/f=30/19/0 sis=29) [0,1]/[2] r=-1 lpr=29 pi=[18,29)/2 luod=0'0 crt=19'236 mlcod 0'0 active+remapped handle_advance_map {1}/{2} -- 1/2
DEBUG 2021-12-07 09:23:18,477 [shard 0] osd - pg_epoch 33 pg[2.1( v 19'236 lc 19'101 (0'0,19'236] local-lis/les=0/0 n=0 ec=12/12 lis/c=29/18 les/c/f=30/19/0 sis=29) [0,1]/[2] r=-1 lpr=29 pi=[18,29)/2 luod=0'0 crt=19'236 mlcod 0'0 active+remapped state<Started>: Started advmap
DEBUG 2021-12-07 09:23:18,478 [shard 0] osd - pg_epoch 33 pg[2.1( v 19'236 lc 19'101 (0'0,19'236] local-lis/les=0/0 n=0 ec=12/12 lis/c=29/18 les/c/f=30/19/0 sis=29) [0,1]/[2] r=-1 lpr=29 pi=[18,29)/2 luod=0'0 crt=19'236 mlcod 0'0 active+remapped new interval newup {1} newacting {2}
DEBUG 2021-12-07 09:23:18,478 [shard 0] osd - pg_epoch 33 pg[2.1( v 19'236 lc 19'101 (0'0,19'236] local-lis/les=0/0 n=0 ec=12/12 lis/c=29/18 les/c/f=30/19/0 sis=29) [0,1]/[2] r=-1 lpr=29 pi=[18,29)/2 luod=0'0 crt=19'236 mlcod 0'0 active+remapped state<Started>: should_restart_peering, transitioning to Reset
INFO 2021-12-07 09:23:18,478 [shard 0] osd - pg_epoch 33 pg[2.1( v 19'236 lc 19'101 (0'0,19'236] local-lis/les=0/0 n=0 ec=12/12 lis/c=29/18 les/c/f=30/19/0 sis=29) [0,1]/[2] r=-1 lpr=29 pi=[18,29)/2 luod=0'0 crt=19'236 mlcod 0'0 active+remapped exit Started/ReplicaActive/RepRecovering 8.880699 5 0.000298
INFO 2021-12-07 09:23:18,478 [shard 0] osd - Exiting state: Started/ReplicaActive/RepRecovering, entered at 1638868989.5980496, 0.000298968 spent on 5 events
INFO 2021-12-07 09:23:18,478 [shard 0] osd - pg_epoch 33 pg[2.1( v 19'236 lc 19'101 (0'0,19'236] local-lis/les=0/0 n=0 ec=12/12 lis/c=29/18 les/c/f=30/19/0 sis=29) [0,1]/[2] r=-1 lpr=29 pi=[18,29)/2 luod=0'0 crt=19'236 mlcod 0'0 active+remapped exit Started/ReplicaActive 27.421174 0 0.000000
INFO 2021-12-07 09:23:18,478 [shard 0] osd - Exiting state: Started/ReplicaActive, entered at 1638868971.057631, 0.0 spent on 0 events
INFO 2021-12-07 09:23:18,478 [shard 0] osd - pg_epoch 33 pg[2.1( v 19'236 lc 19'101 (0'0,19'236] local-lis/les=0/0 n=0 ec=12/12 lis/c=29/18 les/c/f=30/19/0 sis=29) [0,1]/[2] r=-1 lpr=29 pi=[18,29)/2 luod=0'0 crt=19'236 mlcod 0'0 active+remapped exit Started 28.590107 0 0.000000
INFO 2021-12-07 09:23:18,478 [shard 0] osd - Exiting state: Started, entered at 1638868969.8887343, 0.0 spent on 0 events
INFO 2021-12-07 09:23:18,478 [shard 0] osd - pg_epoch 33 pg[2.1( v 19'236 lc 19'101 (0'0,19'236] local-lis/les=0/0 n=0 ec=12/12 lis/c=29/18 les/c/f=30/19/0 sis=29) [0,1]/[2] r=-1 lpr=29 pi=[18,29)/2 luod=0'0 crt=19'236 mlcod 0'0 active+remapped enter Reset
INFO 2021-12-07 09:23:18,478 [shard 0] osd - Entering state: Reset
DEBUG 2021-12-07 09:23:18,478 [shard 0] osd - pg_epoch 33 pg[2.1( v 19'236 lc 19'101 (0'0,19'236] local-lis/les=0/0 n=0 ec=12/12 lis/c=29/18 les/c/f=30/19/0 sis=29) [0,1]/[2] r=-1 lpr=29 pi=[18,29)/2 luod=0'0 crt=19'236 mlcod 0'0 active+remapped set_last_peering_reset 33
DEBUG 2021-12-07 09:23:18,478 [shard 0] osd - pg_epoch 33 pg[2.1( v 19'236 lc 19'101 (0'0,19'236] local-lis/les=0/0 n=0 ec=12/12 lis/c=29/18 les/c/f=30/19/0 sis=29) [0,1]/[2] r=-1 lpr=33 pi=[18,29)/2 luod=0'0 crt=19'236 mlcod 0'0 active+remapped Clearing blocked outgoing recovery messages
DEBUG 2021-12-07 09:23:18,478 [shard 0] osd - pg_epoch 33 pg[2.1( v 19'236 lc 19'101 (0'0,19'236] local-lis/les=0/0 n=0 ec=12/12 lis/c=29/18 les/c/f=30/19/0 sis=29) [0,1]/[2] r=-1 lpr=33 pi=[18,29)/2 luod=0'0 crt=19'236 mlcod 0'0 active+remapped Beginning to block outgoing recovery messages
DEBUG 2021-12-07 09:23:18,478 [shard 0] osd - pg_epoch 33 pg[2.1( v 19'236 lc 19'101 (0'0,19'236] local-lis/les=0/0 n=0 ec=12/12 lis/c=29/18 les/c/f=30/19/0 sis=29) [0,1]/[2] r=-1 lpr=33 pi=[18,29)/2 luod=0'0 crt=19'236 mlcod 0'0 active+remapped state<Reset>: Reset advmap
DEBUG 2021-12-07 09:23:18,479 [shard 0] osd - pg_epoch 33 pg[2.1( v 19'236 lc 19'101 (0'0,19'236] local-lis/les=0/0 n=0 ec=12/12 lis/c=29/18 les/c/f=30/19/0 sis=29) [0,1]/[2] r=-1 lpr=33 pi=[18,29)/2 luod=0'0 crt=19'236 mlcod 0'0 active+remapped new interval newup {1} newacting {2}
DEBUG 2021-12-07 09:23:18,479 [shard 0] osd - pg_epoch 33 pg[2.1( v 19'236 lc 19'101 (0'0,19'236] local-lis/les=0/0 n=0 ec=12/12 lis/c=29/18 les/c/f=30/19/0 sis=29) [0,1]/[2] r=-1 lpr=33 pi=[18,29)/2 luod=0'0 crt=19'236 mlcod 0'0 active+remapped state<Reset>: should restart peering, calling start_peering_interval again
DEBUG 2021-12-07 09:23:18,479 [shard 0] osd - pg_epoch 33 pg[2.1( v 19'236 lc 19'101 (0'0,19'236] local-lis/les=0/0 n=0 ec=12/12 lis/c=29/18 les/c/f=30/19/0 sis=29) [0,1]/[2] r=-1 lpr=33 pi=[18,29)/2 luod=0'0 crt=19'236 mlcod 0'0 active+remapped set_last_peering_reset 33
DEBUG 2021-12-07 09:23:18,479 [shard 0] osd - pg_epoch 33 pg[2.1( v 19'236 lc 19'101 (0'0,19'236] local-lis/les=0/0 n=0 ec=12/12 lis/c=29/18 les/c/f=30/19/0 sis=29) [1]/[2] r=-1 lpr=33 pi=[18,33)/2 luod=0'0 crt=19'236 mlcod 0'0 active+remapped start_peering_interval: check_new_interval output: check_new_interval interval(29-32 up {0, 1}(0) acting {2}(2)) up_thru 30 up_from 28 last_epoch_clean 19 interval(29-32 up {0, 1}(0) acting {2}(2) maybe_went_rw) : primary up 28-30 includes interval
DEBUG 2021-12-07 09:23:18,479 [shard 0] osd - pg_epoch 33 pg[2.1( v 19'236 lc 19'101 (0'0,19'236] local-lis/les=0/0 n=0 ec=12/12 lis/c=29/18 les/c/f=30/19/0 sis=29) [1]/[2] r=-1 lpr=33 pi=[18,33)/2 luod=0'0 crt=19'236 mlcod 0'0 active+remapped noting past ([18,32] all_participants=0,1,2,3 intervals=([26,28] acting 0,1),([29,32] acting 2))
DEBUG 2021-12-07 09:23:18,479 [shard 0] osd - pg_epoch 33 pg[2.1( v 19'236 lc 19'101 (0'0,19'236] local-lis/les=0/0 n=0 ec=12/12 lis/c=29/18 les/c/f=30/19/0 sis=33) [1]/[2] r=-1 lpr=33 pi=[18,33)/2 luod=0'0 crt=19'236 mlcod 0'0 active+remapped on_new_interval
DEBUG 2021-12-07 09:23:18,480 [shard 0] osd - pg_epoch 33 pg[2.1( v 19'236 lc 19'101 (0'0,19'236] local-lis/les=0/0 n=0 ec=12/12 lis/c=29/18 les/c/f=30/19/0 sis=33) [1]/[2] r=-1 lpr=33 pi=[18,33)/2 luod=0'0 crt=19'236 mlcod 0'0 active+remapped on_new_interval upacting_features 0x3f01cfbb7ffdffff from {2}+{1}
DEBUG 2021-12-07 09:23:18,480 [shard 0] osd - pg_epoch 33 pg[2.1( v 19'236 lc 19'101 (0'0,19'236] local-lis/les=0/0 n=0 ec=12/12 lis/c=29/18 les/c/f=30/19/0 sis=33) [1]/[2] r=-1 lpr=33 pi=[18,33)/2 luod=0'0 crt=19'236 mlcod 0'0 active+remapped on_new_interval checking missing set deletes flag. missing = missing(135 may_include_deletes = 1)
DEBUG 2021-12-07 09:23:18,480 [shard 0] osd - pg_epoch 33 pg[2.1( v 19'236 lc 19'101 (0'0,19'236] local-lis/les=0/0 n=0 ec=12/12 lis/c=29/18 les/c/f=30/19/0 sis=33) [1]/[2] r=-1 lpr=33 pi=[18,33)/2 luod=0'0 crt=19'236 mlcod 0'0 active+remapped init_hb_stamps now {}
DEBUG 2021-12-07 09:23:18,480 [shard 0] osd - pg_epoch 33 pg[2.1( v 19'236 lc 19'101 (0'0,19'236] local-lis/les=0/0 n=0 ec=12/12 lis/c=29/18 les/c/f=30/19/0 sis=33) [1]/[2] r=-1 lpr=33 pi=[18,33)/2 luod=0'0 crt=19'236 mlcod 0'0 active+remapped on_new_interval prior_readable_until_ub 0.000000000s (mnow 145.976531982s + 0.000000000s)
INFO 2021-12-07 09:23:18,480 [shard 0] osd - pg_epoch 33 pg[2.1( v 19'236 lc 19'101 (0'0,19'236] local-lis/les=0/0 n=0 ec=12/12 lis/c=29/18 les/c/f=30/19/0 sis=33) [1]/[2] r=-1 lpr=33 pi=[18,33)/2 luod=0'0 crt=19'236 mlcod 0'0 active+remapped start_peering_interval up {0, 1} -> {1}, acting {2} -> {2}, acting_primary 2 -> 2, up_primary 0 -> 1, role -1 -> -1, features acting 4540138303579357183 upacting 4540138303579357183
DEBUG 2021-12-07 09:23:18,480 [shard 0] osd - pg_epoch 33 pg[2.1( v 19'236 lc 19'101 (0'0,19'236] local-lis/les=0/0 n=0 ec=12/12 lis/c=29/18 les/c/f=30/19/0 sis=33) [1]/[2] r=-1 lpr=33 pi=[18,33)/2 crt=19'236 mlcod 0'0 remapped clear_primary_state
DEBUG 2021-12-07 09:23:18,480 [shard 0] osd - on_change, pg_epoch 33 pg[2.1( v 19'236 lc 19'101 (0'0,19'236] local-lis/les=0/0 n=0 ec=12/12 lis/c=29/18 les/c/f=30/19/0 sis=33) [1]/[2] r=-1 lpr=33 pi=[18,33)/2 crt=19'236 mlcod 0'0 remapped
DEBUG 2021-12-07 09:23:18,480 [shard 0] osd - pg_epoch 33 pg[2.1( v 19'236 lc 19'101 (0'0,19'236] local-lis/les=0/0 n=0 ec=12/12 lis/c=29/18 les/c/f=30/19/0 sis=33) [1]/[2] r=-1 lpr=33 pi=[18,33)/2 crt=19'236 mlcod 0'0 remapped NOTIFY check_recovery_sources no source osds () went down
```
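A rough Python model of the `recover_got()` invariant that fired (illustrative, not the actual `PeeringState` code): after the reset, the stale push pushed `complete_to` off the end of the log while `last_complete` (19'101) still trailed `last_update` (19'236).

```python
# Model of the assert in recover_got() (illustrative). While objects are
# still missing, complete_to points at the oldest missing log entry; once
# complete_to runs off the end of the log, last_complete must have caught
# up with last_update. A stale MOSDPGPush processed after the peering
# reset reaches the "complete_to at end" branch with last_complete still
# behind, tripping the assert.

def recover_got(last_complete, last_update, complete_to_at_end):
    if complete_to_at_end:
        # ceph_assert(info.last_complete == info.last_update)
        return last_complete == last_update
    return True

# Healthy case: recovery really finished.
assert recover_got("19'236", "19'236", complete_to_at_end=True)
# The crash: stale push after the reset, lc=19'101 vs v=19'236.
assert not recover_got("19'101", "19'236", complete_to_at_end=True)
```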
Signed-off-by: Radoslaw Zarzynski <rzarzyns@redhat.com>
Radoslaw Zarzynski [Thu, 16 Dec 2021 09:57:24 +0000 (09:57 +0000)]
crimson/osd: make PG::can_discard_replica_op() reusable for RecoveryBackend.
Signed-off-by: Radoslaw Zarzynski <rzarzyns@redhat.com>
Samuel Just [Mon, 8 Nov 2021 20:59:40 +0000 (12:59 -0800)]
Merge pull request #43754 from cyx1231st/wip-seastore-fix-journal-committed-to
crimson/os/seastore: fix ordered updates to JournalSegmentManager::committed_to
Reviewed-by: Samuel Just <sjust@redhat.com>
Reviewed-by: Xuehan Xu <xxhdx1985126@gmail.com>
Sage Weil [Mon, 8 Nov 2021 19:43:25 +0000 (14:43 -0500)]
Merge PR #43827 into master
* refs/pull/43827/head:
qa/suites/orch/cephadm: add repave-all test case
mgr/cephadm/services/osd: less noisy
mgr/cephadm/services/osd: do not log ok-to-stop/safe-to-destroy failures
mgr/orchestrator: clean up 'orch osd rm status'
Reviewed-by: Adam King <adking@redhat.com>
Yuri Weinstein [Mon, 8 Nov 2021 17:51:27 +0000 (09:51 -0800)]
Merge pull request #43699 from sebastian-philipp/qa-rados-mgr-random-objectstore
qa/suites/rados/mgr: use only one objectstore instead of all
Reviewed-by: Sage Weil <sage@redhat.com>
Reviewed-by: Neha Ojha <nojha@redhat.com>
Yuri Weinstein [Mon, 8 Nov 2021 17:50:33 +0000 (09:50 -0800)]
Merge pull request #43621 from ifed01/wip-ifed-fix-53011
os/bluestore: use proper prefix when removing undecodable Shared Blob.
Reviewed-by: Adam Kupczyk <akupczyk@redhat.com>
Sebastian Wagner [Mon, 8 Nov 2021 16:15:55 +0000 (17:15 +0100)]
Merge pull request #43635 from adk3798/agent-responsiveness
mgr/cephadm: improve agent responsiveness
Reviewed-by: Daniel Pivonka <dpivonka@redhat.com>
Reviewed-by: Sebastian Wagner <sewagner@redhat.com>
Guillaume Abrioux [Mon, 8 Nov 2021 13:16:42 +0000 (14:16 +0100)]
Merge pull request #43574 from sabzco/ceph-volume-fix
ceph-volume: fix a typo causing AttributeError
Guillaume Abrioux [Mon, 8 Nov 2021 09:27:41 +0000 (10:27 +0100)]
Merge pull request #43679 from guits/cv_quick_update_tests
ceph-volume/tests: update setup_mixed_type playbook
Sage Weil [Sat, 6 Nov 2021 16:29:53 +0000 (12:29 -0400)]
Merge PR #43826 into master
* refs/pull/43826/head:
mgr/cephadm: allow zapping devices from other clusters
Reviewed-by: Adam King <adking@redhat.com>
Sage Weil [Fri, 5 Nov 2021 19:00:10 +0000 (15:00 -0400)]
qa/suites/orch/cephadm: add repave-all test case
Signed-off-by: Sage Weil <sage@newdream.net>
Sage Weil [Fri, 5 Nov 2021 18:37:58 +0000 (14:37 -0400)]
mgr/cephadm/services/osd: less noisy
Signed-off-by: Sage Weil <sage@newdream.net>
Sage Weil [Fri, 5 Nov 2021 18:37:47 +0000 (14:37 -0400)]
mgr/cephadm/services/osd: do not log ok-to-stop/safe-to-destroy failures
These failures are normal and expected; they should not pollute the log.
Signed-off-by: Sage Weil <sage@newdream.net>
Sage Weil [Fri, 5 Nov 2021 18:36:54 +0000 (14:36 -0400)]
mgr/cephadm: allow zapping devices from other clusters
This covers 99% of the devices that ever get zapped.
Fixes: b7782084ac9657be9b2da6ebd56b5029cf859225
Signed-off-by: Sage Weil <sage@newdream.net>
Neha Ojha [Fri, 5 Nov 2021 18:37:57 +0000 (11:37 -0700)]
Merge pull request #43814 from neha-ojha/wip-more-cv
qa/suites/upgrade/octopus-x/stress-split-no-cephadm: exclude ceph-volume
Reviewed-by: Yuri Weinstein <yweinste@redhat.com>
Sage Weil [Fri, 5 Nov 2021 18:24:47 +0000 (14:24 -0400)]
mgr/orchestrator: clean up 'orch osd rm status'
Signed-off-by: Sage Weil <sage@newdream.net>
Ali Maredia [Fri, 5 Nov 2021 16:34:53 +0000 (12:34 -0400)]
Merge pull request #43808 from cbodley/wip-qa-rgw-java-master
qa/rgw: master branch targets ceph-master branch of java_s3tests
Sebastian Wagner [Fri, 5 Nov 2021 14:09:20 +0000 (15:09 +0100)]
Merge pull request #43807 from sebastian-philipp/osd_memory_target_autotune-true
doc/cephadm: Recommend osd_memory_target_autotune
Reviewed-by: Neha Ojha <nojha@redhat.com>
Reviewed-by: Sage Weil <sage@newdream.net>
Sebastian Wagner [Thu, 4 Nov 2021 15:49:21 +0000 (16:49 +0100)]
doc/cephadm: Recommend osd_memory_target_autotune
In case the cluster runs on hardware that is used exclusively for
Ceph, let's recommend `osd_memory_target_autotune`.
Signed-off-by: Sebastian Wagner <sewagner@redhat.com>
Neha Ojha [Mon, 1 Nov 2021 23:55:22 +0000 (23:55 +0000)]
qa/suites/upgrade/octopus-x/stress-split-no-cephadm: exclude ceph-volume
To address failures like
```
Command failed on smithi096 with status 100: 'sudo DEBIAN_FRONTEND=noninteractive apt-get -y --force-yes -o Dpkg::Options::="--force-confdef" -o Dpkg::Options::="--force-confold" install ceph=15.2.15-11-g5f8f263c-1focal ceph-mds=15.2.15-11-g5f8f263c-1focal ceph-mgr=15.2.15-11-g5f8f263c-1focal ceph-common=15.2.15-11-g5f8f263c-1focal ceph-fuse=15.2.15-11-g5f8f263c-1focal ceph-test=15.2.15-11-g5f8f263c-1focal ceph-volume=15.2.15-11-g5f8f263c-1focal radosgw=15.2.15-11-g5f8f263c-1focal python3-rados=15.2.15-11-g5f8f263c-1focal python3-rgw=15.2.15-11-g5f8f263c-1focal python3-cephfs=15.2.15-11-g5f8f263c-1focal python3-rbd=15.2.15-11-g5f8f263c-1focal libcephfs2=15.2.15-11-g5f8f263c-1focal librados2=15.2.15-11-g5f8f263c-1focal librbd1=15.2.15-11-g5f8f263c-1focal rbd-fuse=15.2.15-11-g5f8f263c-1focal'
```
Signed-off-by: Neha Ojha <nojha@redhat.com>
Neha Ojha [Thu, 4 Nov 2021 21:27:48 +0000 (14:27 -0700)]
Merge pull request #43406 from ljflores/wip-telemetry-perf-improvements
mgr/telemetry: add mempool stats to telemetry perf report
Reviewed-by: Yaarit Hatuka <yaarit@redhat.com>
Sage Weil [Thu, 4 Nov 2021 21:14:19 +0000 (17:14 -0400)]
Merge PR #42727 into master
* refs/pull/42727/head:
mgr/orchestrator: improve usage string for 'orch daemon add osd'
ceph-volume: activate: try simple mode too
mgr/cephadm: identify and instantiate raw osds post-create
mgr/orchestrator: accept --method arg to 'orch daemon add osd'
python-common: drivegroup: add 'method' property
cephadm: use generic ceph-volume activate
ceph-volume: top-level 'activate' command
ceph-volume: lvm activate: add --no-tmpfs
ceph-volume: lvm activate: infer bluestore or filestore
ceph-volume: raw activate: accept --osd-id and/or --osd-uuid instead of device
Reviewed-by: Sebastian Wagner <sewagner@redhat.com>
Reviewed-by: Guillaume Abrioux <gabrioux@redhat.com>
Yuri Weinstein [Thu, 4 Nov 2021 21:10:57 +0000 (14:10 -0700)]
Merge pull request #43705 from tchaikov/wip-no-more-python2
mgr: do not handle Python2
Reviewed-by: Sebastian Wagner <sebastian.wagner@suse.com>
Reviewed-by: Laura Flores <lflores@redhat.com>
Yuri Weinstein [Thu, 4 Nov 2021 21:08:07 +0000 (14:08 -0700)]
Merge pull request #43700 from liewegas/fix-24990
ceph_test_rados_api_watch_notify: extend Watch3Timeout test
Reviewed-by: Neha Ojha <nojha@redhat.com>
Yuri Weinstein [Thu, 4 Nov 2021 21:07:35 +0000 (14:07 -0700)]
Merge pull request #43664 from NUABO/tanchangzhi
osd: fix 'ceph osd stop <osd.nnn>' doesn't take effect
Reviewed-by: xie xingguo <xie.xingguo@zte.com.cn>
Patrick Donnelly [Thu, 4 Nov 2021 20:55:43 +0000 (16:55 -0400)]
Merge PR #43752 into master
* refs/pull/43752/head:
client: remove useless _openat()
client: remove optional for dirfd parameter
Reviewed-by: Patrick Donnelly <pdonnell@redhat.com>
Reviewed-by: Jeff Layton <jlayton@redhat.com>
Patrick Donnelly [Thu, 4 Nov 2021 20:54:14 +0000 (16:54 -0400)]
Merge PR #43666 into master
* refs/pull/43666/head:
qa/vstart_runner: add "managers" to LocalContext instances
Reviewed-by: Jos Collin <jcollin@redhat.com>
Reviewed-by: Xiubo Li <xiubli@redhat.com>
Reviewed-by: Patrick Donnelly <pdonnell@redhat.com>
Patrick Donnelly [Thu, 4 Nov 2021 20:53:20 +0000 (16:53 -0400)]
Merge PR #43638 into master
* refs/pull/43638/head:
qa: pass subdir arg when executing workunit
Reviewed-by: Patrick Donnelly <pdonnell@redhat.com>
Patrick Donnelly [Thu, 4 Nov 2021 20:52:23 +0000 (16:52 -0400)]
Merge PR #43613 into master
* refs/pull/43613/head:
qa: lengthen health warning wait
Reviewed-by: Jeff Layton <jlayton@redhat.com>
Reviewed-by: Xiubo Li <xiubli@redhat.com>
Patrick Donnelly [Thu, 4 Nov 2021 20:51:11 +0000 (16:51 -0400)]
Merge PR #41667 into master
* refs/pull/41667/head:
mds: do not trim cache when creating system file
mds: fix the comment in add_inode
Reviewed-by: Patrick Donnelly <pdonnell@redhat.com>
Sage Weil [Thu, 4 Nov 2021 18:33:45 +0000 (14:33 -0400)]
Merge PR #43611 into master
* refs/pull/43611/head:
doc/mgr/nfs: document rgw user and bucket exports
PendingReleaseNotes: add note about nfs CLI change(s)
qa/suites/orch/cephadm/smoke-roleless: add rgw user nfs export case
mgr/nfs: take user-id and/or bucket for 'nfs export create rgw'
mgr/nfs: reorder 'nfs export create rgw' arguments
mgr/nfs: reorder 'nfs export create cephfs' arguments
mgr/nfs: use keyword args for 'nfs export create rgw'
mgr/nfs: document and use keyword args for 'nfs export create cephfs'
qa/tasks/cephfs/test_nfs: use keyword args
pybind/ceph_argparse: handle misordered keyword arguments
Reviewed-by: Alfonso Martínez <almartin@redhat.com>
Reviewed-by: Sebastian Wagner <sewagner@redhat.com>
Reviewed-by: Michael Fritch <mfritch@suse.com>
Reviewed-by: Ramana Raja <rraja@redhat.com>
Daniel Gryniewicz [Thu, 4 Nov 2021 17:19:41 +0000 (13:19 -0400)]
Merge pull request #43768 from Huber-ming/admin_mdlog_fetch
radosgw-admin: supplement help documents with 'mdlog autotrim'
Casey Bodley [Thu, 4 Nov 2021 16:56:26 +0000 (12:56 -0400)]
Merge pull request #43778 from adamemerson/wip-53132
rgw: Ensure buckets too old to decode a layout have layout logs
Reviewed-by: Casey Bodley <cbodley@redhat.com>
Casey Bodley [Thu, 4 Nov 2021 16:12:04 +0000 (12:12 -0400)]
qa/rgw: master branch targets ceph-master branch of java_s3tests
Signed-off-by: Casey Bodley <cbodley@redhat.com>
Casey Bodley [Thu, 4 Nov 2021 16:07:05 +0000 (12:07 -0400)]
Merge pull request #37184 from ybwang0211/KMSMSMSMS_return_error_message
rgw:When KMS encryption is used and the key does not exist, we should…
Reviewed-by: Casey Bodley <cbodley@redhat.com>
Reviewed-by: J. Eric Ivancich <ivancich@redhat.com>
Ernesto Puerta [Thu, 4 Nov 2021 16:00:13 +0000 (17:00 +0100)]
Merge pull request #43725 from rhcs-dashboard/nfs-export-form-fix
mgr/dashboard: NFS 'create export' form: fixes
Reviewed-by: Alfonso Martínez <almartin@redhat.com>
Reviewed-by: Avan Thakkar <athakkar@redhat.com>
Reviewed-by: Ernesto Puerta <epuertat@redhat.com>
Reviewed-by: Nizamudeen A <nia@redhat.com>
Reviewed-by: Pere Diaz Bou <pdiazbou@redhat.com>
Sage Weil [Thu, 4 Nov 2021 14:41:44 +0000 (10:41 -0400)]
doc/mgr/nfs: document rgw user and bucket exports
Signed-off-by: Sage Weil <sage@newdream.net>
Sage Weil [Mon, 25 Oct 2021 19:52:28 +0000 (15:52 -0400)]
PendingReleaseNotes: add note about nfs CLI change(s)
Signed-off-by: Sage Weil <sage@newdream.net>
Sage Weil [Mon, 25 Oct 2021 14:53:56 +0000 (10:53 -0400)]
qa/suites/orch/cephadm/smoke-roleless: add rgw user nfs export case
Signed-off-by: Sage Weil <sage@newdream.net>
Sage Weil [Wed, 20 Oct 2021 21:33:27 +0000 (17:33 -0400)]
mgr/nfs: take user-id and/or bucket for 'nfs export create rgw'
- move the bucket / user position after the cluster_id and pseudo_path
(since they are optional)
- require bucket or user or both
- if bucket, use the bucket owner
- if bucket+user, use that user
- if user only, then export at top-level (all of the user's buckets)
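The bucket/user resolution rules above can be sketched as follows (a hypothetical helper, not the actual mgr/nfs code; `lookup_bucket_owner` is an assumed stand-in for an RGW ownership lookup):

```python
def resolve_rgw_export(bucket=None, user_id=None):
    """Sketch of the bucket/user rules for 'nfs export create rgw'
    (hypothetical helper, not the real mgr/nfs implementation)."""
    if not bucket and not user_id:
        # Require bucket or user or both.
        raise ValueError("need bucket and/or user-id")
    if bucket and user_id:
        # bucket+user: export the bucket as that user.
        return {"bucket": bucket, "user": user_id}
    if bucket:
        # bucket only: export as the bucket owner.
        return {"bucket": bucket, "user": lookup_bucket_owner(bucket)}
    # user only: top-level export covering all of the user's buckets.
    return {"bucket": None, "user": user_id}

def lookup_bucket_owner(bucket):
    # Assumption: stand-in for an RGW bucket-ownership lookup.
    return {"mybucket": "alice"}.get(bucket, "unknown")

assert resolve_rgw_export(bucket="mybucket") == {"bucket": "mybucket", "user": "alice"}
assert resolve_rgw_export(bucket="mybucket", user_id="bob")["user"] == "bob"
assert resolve_rgw_export(user_id="carol") == {"bucket": None, "user": "carol"}
```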
Fixes: https://tracker.ceph.com/issues/53134
Signed-off-by: Sage Weil <sage@newdream.net>
Sage Weil [Thu, 4 Nov 2021 14:07:14 +0000 (10:07 -0400)]
mgr/orchestrator: improve usage string for 'orch daemon add osd'
Signed-off-by: Sage Weil <sage@newdream.net>
Alfonso Martínez [Thu, 4 Nov 2021 13:56:37 +0000 (14:56 +0100)]
mgr/dashboard: NFS 'create export' form: fixes
* Do not allow a pseudo that is already in use by another export.
* Create mode form: prefill dropdown selectors if options > 0.
* Edit mode form: do not reset the field values that depend on other values that are being edited (unlike Create mode).
* Fix broken link: cluster service.
* Fix error message style for non-existent cephfs path.
* nfs-service.ts: lsDir: throw error if volume is not provided.
* File renaming: nfsganesha.py => nfs.py; test_ganesha.py => test_nfs.py
Fixes: https://tracker.ceph.com/issues/53083
Signed-off-by: Alfonso Martínez <almartin@redhat.com>
Sebastian Wagner [Thu, 4 Nov 2021 09:45:56 +0000 (10:45 +0100)]
Merge pull request #43737 from AndrewSharapov/master
mgr/cephadm: Fixed spawning ip addresses list for public network interface
Reviewed-by: Sebastian Wagner <sewagner@redhat.com>
Ernesto Puerta [Thu, 4 Nov 2021 09:17:57 +0000 (10:17 +0100)]
Merge pull request #43797 from rhcs-dashboard/fix-53144-master
mgr/dashboard: fix missing alert rule details
Reviewed-by: Alfonso Martínez <almartin@redhat.com>
Reviewed-by: Ernesto Puerta <epuertat@redhat.com>
Reviewed-by: Pere Diaz Bou <pdiazbou@redhat.com>
Deepika Upadhyay [Thu, 4 Nov 2021 07:45:42 +0000 (13:15 +0530)]
Merge pull request #43461 from CongMinYin/fix-flush-advance
librbd/cache/pwl: fix external flush dispatch in advance
Reviewed-by: Mykola Golub <mgolub@suse.com>
Yingxin Cheng [Mon, 1 Nov 2021 08:28:59 +0000 (16:28 +0800)]
crimson/os/seastore: fix ordered updates to JournalSegmentManager::committed_to
The journal segment should not update committed_to during rolling, as
there might still be pending writes from the previous segment.
A side-effect here is that committed_to now needs to include
segment_seq_t to point to a previous segment.
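The ordering concern can be sketched with simple tuples (illustrative, not the actual seastore types):

```python
# Why committed_to must carry a segment sequence number: a plain
# intra-segment offset goes backwards when the journal rolls to a new
# segment, while a (segment_seq, offset) pair stays monotonic under
# lexicographic comparison.

def committed_to_advances(old, new):
    # old/new are (segment_seq, offset) pairs; lexicographic order.
    return new >= old

# Within one segment, offsets advance normally.
assert committed_to_advances((5, 4096), (5, 8192))
# After rolling to segment 6 the raw offset resets to 0, yet the
# (seq, offset) pair still compares as a later journal position.
assert committed_to_advances((5, 8192), (6, 0))
```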
Signed-off-by: Yingxin Cheng <yingxin.cheng@intel.com>
Samuel Just [Thu, 4 Nov 2021 04:08:43 +0000 (21:08 -0700)]
Merge pull request #43617 from cyx1231st/wip-seastore-batch-journal-records
crimson/os/seastore/journal: support both batching and concurrent writes
Reviewed-by: Samuel Just <sjust@redhat.com>
Samuel Just [Thu, 4 Nov 2021 03:27:59 +0000 (20:27 -0700)]
Merge pull request #43781 from liu-chunmei/osd_uuid_zero
crimson/osd: randomize the osd_uuid if not specified
Reviewed-by: Kefu Chai <tchaikov@gmail.com>
Reviewed-by: Samuel Just <sjust@redhat.com>
chunmei-liu [Wed, 3 Nov 2021 22:03:24 +0000 (15:03 -0700)]
crimson/osd: randomize the osd_uuid if not specified
address the failure spotted in a teuthology-based test:
sudo ceph --cluster ceph osd new 00000000-0000-0000-0000-000000000000 0
Signed-off-by: chunmei-liu <chunmei.liu@intel.com>
Huber-ming [Thu, 4 Nov 2021 01:25:37 +0000 (09:25 +0800)]
radosgw-admin: supplement help documents with 'mdlog autotrim'
Signed-off-by: Huber-ming <zhangsm01@inspur.com>
Neha Ojha [Wed, 3 Nov 2021 23:14:14 +0000 (16:14 -0700)]
Merge pull request #43456 from ljflores/wip-separate-data
mgr/telemetry: provide option for separated data in the telemetry perf channel
Reviewed-by: Yaarit Hatuka <yaarit@redhat.com>
Casey Bodley [Wed, 3 Nov 2021 21:04:03 +0000 (17:04 -0400)]
Merge pull request #43591 from cbodley/wip-52976
radosgw-admin: allow 'bi purge' to delete index if entrypoint doesn't exist
Reviewed-by: Adam C. Emerson <aemerson@redhat.com>
Reviewed-by: J. Eric Ivancich <ivancich@redhat.com>
Reviewed-by: Daniel Gryniewicz <dang@redhat.com>
Casey Bodley [Wed, 3 Nov 2021 19:28:43 +0000 (15:28 -0400)]
Merge pull request #43710 from cbodley/wip-53003
rgw: fix self-comparison for RGWCopyObj optimization
Reviewed-by: Daniel Gryniewicz <dang@redhat.com>
Casey Bodley [Wed, 3 Nov 2021 19:28:25 +0000 (15:28 -0400)]
Merge pull request #43715 from cbodley/wip-52716
rgw: ListMultipartUploads returns the real upload Owners
Reviewed-by: Daniel Gryniewicz <dang@redhat.com>
Casey Bodley [Wed, 3 Nov 2021 19:27:56 +0000 (15:27 -0400)]
Merge pull request #43779 from cbodley/wip-47527
rgw: fix ListBucketMultiparts response with common prefixes
Reviewed-by: J. Eric Ivancich <ivancich@redhat.com>
Reviewed-by: Daniel Gryniewicz <dang@redhat.com>
Adam C. Emerson [Tue, 2 Nov 2021 16:46:15 +0000 (12:46 -0400)]
rgw: Ensure buckets too old to decode a layout have layout logs
When decoding `RGWBucketInfo` data from before Pacific, we won't call
`rgw::BucketLayout::decode`, but will instead synthesize the layout
information. This leaves the `rgw::BucketLayout::logs` empty, as the
fallback to populate it only applies to old versions of
`rgw::BucketLayout`.
Add a check at the end of `RGWBucketInfo::decode` to populate it if
empty.
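The decode-time fallback can be sketched like this (illustrative Python, not the actual `RGWBucketInfo::decode` C++):

```python
# Sketch of the fix: bucket info encoded before Pacific carries no
# layout logs, so after decoding we synthesize one entry from the
# current layout when the logs list came back empty. Dict fields here
# are illustrative stand-ins for the real RGWBucketInfo members.

def decode_bucket_info(encoded):
    info = dict(encoded)  # pretend this came off the wire
    if not info.get("layout_logs"):
        # Too old to have encoded logs: populate from the current layout.
        info["layout_logs"] = [{"gen": 0, "layout": info["layout"]}]
    return info

# Pre-Pacific info: logs synthesized.
old = decode_bucket_info({"layout": "normal"})
assert old["layout_logs"] == [{"gen": 0, "layout": "normal"}]
# Modern info: existing logs left untouched.
new = decode_bucket_info({"layout": "normal",
                          "layout_logs": [{"gen": 3, "layout": "normal"}]})
assert new["layout_logs"][0]["gen"] == 3
```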
Fixes: https://tracker.ceph.com/issues/53132
Signed-off-by: Adam C. Emerson <aemerson@redhat.com>
Deepika Upadhyay [Wed, 3 Nov 2021 18:25:31 +0000 (23:55 +0530)]
Merge pull request #42950 from CongMinYin/fix-dead-lock-during-shutdown
librbd/cache/pwl/ssd: fix dead lock and assert during shutdown
Reviewed-by: Mykola Golub <mykola.golub@clyso.com>
Reviewed-by: Deepika Upadhyay <dupadhya@redhat.com>
Ernesto Puerta [Wed, 3 Nov 2021 17:57:53 +0000 (18:57 +0100)]
mgr/dashboard: fix missing alert rule details
Fixes: https://tracker.ceph.com/issues/53144
Signed-off-by: Ernesto Puerta <epuertat@redhat.com>
Casey Bodley [Wed, 3 Nov 2021 16:05:22 +0000 (12:05 -0400)]
Merge pull request #43753 from soumyakoduri/wip-skoduri-dblock
rgw/dbstore: No need for explicit LOCK in DBStore
Reviewed-by: Daniel Gryniewicz <dang@redhat.com>
Sebastian Wagner [Wed, 3 Nov 2021 15:27:53 +0000 (16:27 +0100)]
Merge pull request #43790 from sebastian-philipp/doc-cephadm-purge
doc/cephadm: purge
Reviewed-by: Michael Fritch <mfritch@suse.com>
Deepika Upadhyay [Wed, 3 Nov 2021 14:55:21 +0000 (20:25 +0530)]
Merge pull request #43659 from majianpeng/send-internal-flush-for-rbd-copy
librbd: send FLUSH_SOURCE_INTERNAL when doing copy/deep_copy.
Reviewed-by: Mykola Golub <mykola.golub@clyso.com>
Reviewed-by: Sunny Kumar <sunkumar@redhat.com>
Reviewed-by: Deepika Upadhyay <dupadhya@redhat.com>
Sebastian Wagner [Wed, 3 Nov 2021 13:11:00 +0000 (14:11 +0100)]
doc/cephadm: purge
Fixes: https://tracker.ceph.com/issues/50534
Signed-off-by: Sebastian Wagner <sewagner@redhat.com>
Patrick Donnelly [Wed, 3 Nov 2021 13:40:55 +0000 (09:40 -0400)]
Merge PR #43786 into master
* refs/pull/43786/head:
mds: fix typo in MDSRank.cc
Reviewed-by: Patrick Donnelly <pdonnell@redhat.com>
Reviewed-by: Xiubo Li <xiubli@redhat.com>
Ernesto Puerta [Wed, 3 Nov 2021 13:08:32 +0000 (14:08 +0100)]
Merge pull request #43690 from rhcs-dashboard/improve-error-handling-get-facts-backend
mgr/dashboard: improve error handling for gather_facts
Reviewed-by: Alfonso Martínez <almartin@redhat.com>
Reviewed-by: Avan Thakkar <athakkar@redhat.com>
Reviewed-by: Pere Diaz Bou <pdiazbou@redhat.com>
Alfonso Martínez [Wed, 3 Nov 2021 09:17:21 +0000 (10:17 +0100)]
mgr/dashboard: python unit tests refactoring
* Controller tests: cherrypy config: authentication disabled by default; ability to pass custom config (e.g. enable authentication).
* Auth controller: add tests; test that unauthorized request fails when authentication is enabled.
* DocsTest: clear ENDPOINT_MAP so the test_gen_tags test becomes deterministic.
* pylint: disable=no-name-in-module: fix imports in tests.
Fixes: https://tracker.ceph.com/issues/53083
Signed-off-by: Alfonso Martínez <almartin@redhat.com>
Yingxin Cheng [Mon, 25 Oct 2021 08:40:59 +0000 (16:40 +0800)]
crimson/os/seastore/journal: measure io-depth and batching
Signed-off-by: Yingxin Cheng <yingxin.cheng@intel.com>
Yingxin Cheng [Thu, 21 Oct 2021 08:01:46 +0000 (16:01 +0800)]
crimson/os/seastore/journal: implement RecordSubmitter and RecordBatch
Batch records for write while still allowing concurrent writes.
The current change doesn't show a clear write-performance preference
for either batching or concurrent writes, so set the io_depth to 2 for
demonstration.
Signed-off-by: Yingxin Cheng <yingxin.cheng@intel.com>
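The batching-vs-concurrency trade-off can be sketched like this (a toy Python model under assumed semantics, not the actual seastore code): up to `io_depth` writes are dispatched immediately, and records arriving while the pipeline is full are grouped into a single batched write that goes out when an in-flight write completes.

```python
class RecordSubmitter:
    """Toy model: concurrent writes up to io_depth, batching beyond that."""
    def __init__(self, io_depth=2):
        self.io_depth = io_depth
        self.in_flight = 0
        self.pending = []   # records waiting to be batched
        self.flushed = []   # each element models one device write

    def submit(self, record):
        if self.in_flight < self.io_depth:
            self.in_flight += 1
            self.flushed.append([record])    # write immediately
        else:
            self.pending.append(record)      # batch behind in-flight I/O

    def on_write_complete(self):
        self.in_flight -= 1
        if self.pending:
            self.in_flight += 1
            self.flushed.append(self.pending)  # one batched write
            self.pending = []
```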
Yongseok Oh [Wed, 3 Nov 2021 06:55:31 +0000 (06:55 +0000)]
mds: fix typo in MDSRank.cc
Signed-off-by: Yongseok Oh <yongseok.oh@linecorp.com>
zdover23 [Wed, 3 Nov 2021 03:04:35 +0000 (13:04 +1000)]
Merge pull request #43750 from zdover23/wip-doc-2021-10-30-omap-format-conversion-data-corruption-bug-admonition
doc: add admonition for tracker 53062
Reviewed-by: Laura Flores <lflores@redhat.com>
Neha Ojha [Tue, 2 Nov 2021 23:39:03 +0000 (16:39 -0700)]
Merge pull request #43769 from ifed01/wip-ifed-omap-upgrade-fix-notes
PendingReleaseNotes: document OMAP upgrade bug.
Reviewed-by: Neha Ojha <nojha@redhat.com>
Sage Weil [Wed, 20 Oct 2021 19:39:36 +0000 (15:39 -0400)]
mgr/nfs: reorder 'nfs export create rgw' arguments
Put bucket name last so that it paves the way for an optional 'user' arg
to go along with it.
Signed-off-by: Sage Weil <sage@newdream.net>
Sage Weil [Wed, 20 Oct 2021 19:39:03 +0000 (15:39 -0400)]
mgr/nfs: reorder 'nfs export create cephfs' arguments
Put fsname after cluster_id + pseudo_path so that it aligns with the change
to the rgw command.
Signed-off-by: Sage Weil <sage@newdream.net>
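The ordering rationale behind both reorder commits can be illustrated with a hypothetical signature (not the actual mgr/nfs API): placing the bucket name last lets an optional `user` argument be appended later without breaking existing positional callers.

```python
# Hypothetical command signature: the trailing positional (bucket) is
# followed only by optional keywords, so a future `user` arg slots in
# without reshuffling earlier arguments.
def export_create_rgw(cluster_id, pseudo_path, bucket, user=None):
    return {"cluster": cluster_id, "pseudo": pseudo_path,
            "bucket": bucket, "user": user}
```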
Sage Weil [Wed, 20 Oct 2021 19:38:27 +0000 (15:38 -0400)]
mgr/nfs: use keyword args for 'nfs export create rgw'
Signed-off-by: Sage Weil <sage@newdream.net>
Sage Weil [Wed, 20 Oct 2021 19:35:49 +0000 (15:35 -0400)]
mgr/nfs: document and use keyword args for 'nfs export create cephfs'
Signed-off-by: Sage Weil <sage@newdream.net>
Sage Weil [Tue, 2 Nov 2021 16:38:05 +0000 (12:38 -0400)]
qa/tasks/cephfs/test_nfs: use keyword args
Signed-off-by: Sage Weil <sage@newdream.net>
Igor Fedotov [Tue, 2 Nov 2021 11:54:55 +0000 (14:54 +0300)]
PendingReleaseNotes: document OMAP upgrade bug.
Signed-off-by: Igor Fedotov <igor.fedotov@croit.io>
Casey Bodley [Tue, 2 Nov 2021 18:18:31 +0000 (14:18 -0400)]
rgw: fix ListBucketMultiparts response with common prefixes
see documentation and examples in https://docs.aws.amazon.com/AmazonS3/latest/API/API_ListMultipartUploads.html
that use Prefix directly
Fixes: https://tracker.ceph.com/issues/47527
Signed-off-by: Casey Bodley <cbodley@redhat.com>
Sage Weil [Tue, 2 Nov 2021 13:41:49 +0000 (09:41 -0400)]
pybind/ceph_argparse: handle misordered keyword arguments
Signed-off-by: Sage Weil <sage@newdream.net>
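A minimal sketch of the idea (not the actual ceph_argparse code; names are made up): pair each `--key value` token with its spec by name rather than by position, so the order in which keyword arguments appear doesn't matter.

```python
def parse_kwargs(argv, known):
    """Collect --key value pairs by name, tolerating any ordering."""
    out = {}
    it = iter(argv)
    for tok in it:
        if tok.startswith("--"):
            key = tok[2:]
            if key in known:
                out[key] = next(it)  # value follows its keyword
    return out
```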
Sebastian Wagner [Tue, 2 Nov 2021 16:07:42 +0000 (17:07 +0100)]
Merge pull request #43762 from sebastian-philipp/doc-cephadm-ceph-monstore-tool
doc/cephadm: Calling miscellaneous ceph tools
Reviewed-by: Michael Fritch <mfritch@suse.com>
Reviewed-by: Sage Weil <sage@newdream.net>
Sage Weil [Tue, 2 Nov 2021 15:41:53 +0000 (11:41 -0400)]
Merge PR #42762 into master
* refs/pull/42762/head:
ceph_test_objectstore: skip BlueStoreUnshareBlobTest with SMR
os/bluestore: debug ExtentMap::update()
os/bluestore: _txc_create inside of alloc_and_submit_lock
os/bluestore: fix cleaner race with collection removal
os/bluestore: add missing ' ' to LruOnodeCacheShare _[un]pin
os/bluestore: use simpler map<> to track (onode, zone) -> offset
os/bluestore: avoid casting zoned implementations again
os/bluestore/ZonedFreelistManager: remove sanity checks
os/bluestore/ZonedAllocator: fix allocate() search
os/bluestore: drain transactions on cleaner zone finish
os/bluestore/ZonedFreelistManager: simplify freelist merge update vs zone reset
os/bluestore: configurable sleep period for cleaner
blk/zoned: make discard a no-op
os/bluestore/ZonedAllocator: count sequential only as 'free'
os/bluestore: expect smr fields IFF device is smr
ceph_test_objectstore: Test for fixing write pointer
ceph_test_objectstore: complain if SMR support not compiled in
test/objectstore/run_smr_bluestore_test.sh
os/bluestore/ZonedAllocator: handle alloc/release spanning zones
os/bluestore: simple cleaner
os/bluestore: be smarter about picking a zone to clean
os/bluestore: avoid writes to cleaning zone
os/bluestore/HybridAllocator: whitespace in debug output
os/bluestore: give conventional region of SMR to bluefs
os/bluestore: separate alloc pointer from shared_alloc.a
test/objectstore/run_smr_bluestore_test.sh
ceph_test_objectstore: skip tests that don't work on SMR
os/bluestore: disable cleaner thread until it is implemented
os/bluestore: fsck verify zone refs
os/bluestore: include object in zone ref keys
os/bluestore: refactor object key helpers a bit
ceph_test_objectstore: skip failing tests on SMR
os/bluestore: report mismatch write pointer during fsck
os/bluestore: simplify zone to clean selection
ceph_test_objectstore: add trivial fsck test
os/bluestore: fsck smr allocations (verify num_dead_bytes, alloc past write pointer)
os/bluestore: duplicate zone refs when cloning
os/bluestore: correct zoned freelist when device write pointers are ahead
os/bluestore/ZonedFreelistManager: whitespace
os/bluestore: fix startup vs device write pointers
blk/zoned: add get_zones() to fetch write pointers
os/bluestore: use 64 bit values for zone_state_t
os/bluestore: reimplement zone backrefs
os/bluestore: fix smr allocator init
os/bluestore: do not use null freelist with SMR
blk/zones: implement HMSMRDevice as KernelDevice child
os/bluestore: fix/simplify zoned_cleaner thread start error handling
os/bluestore: properly reset zoned allocator on startup
os/bluestore: force prefer_deferred_size=0 for smr
os/bluestore: drop SMR 64K min_alloc_size restriction
os/bluestore/ZonedAllocator: less verbose
os/bluestore/ZonedAllocator: simplify debug output prefix
os/bluestore/ZonedAllocator: be consistent with hex debug output
os/bluestore/ZonedAllocator: whitespace
blk/zoned: remove dead VDO code
blk/zoned: add reset_all_zones()
blk/zoned: print error during init
os/bluestore: adjust allocator+freelist interfaces for smr params
os/bluestore: select 'zoned' freelistmanager during mkfs, not mount
Reviewed-by: Igor Fedotov <ifedotov@suse.com>
Sage Weil [Thu, 12 Aug 2021 15:12:59 +0000 (11:12 -0400)]
ceph-volume: activate: try simple mode too
This is of dubious value to cephadm since /etc/ceph/osd/* won't be
populated inside a container. However, it makes sense from a purely
ceph-volume perspective.
Signed-off-by: Sage Weil <sage@newdream.net>
Sage Weil [Tue, 3 Aug 2021 18:36:56 +0000 (14:36 -0400)]
mgr/cephadm: identify and instantiate raw osds post-create
Signed-off-by: Sage Weil <sage@newdream.net>
Sage Weil [Tue, 3 Aug 2021 18:36:39 +0000 (14:36 -0400)]
mgr/orchestrator: accept --method arg to 'orch daemon add osd'
Signed-off-by: Sage Weil <sage@newdream.net>
Sage Weil [Tue, 3 Aug 2021 18:35:27 +0000 (14:35 -0400)]
python-common: drivegroup: add 'method' property
The DriveGroup method can be none, 'raw', or 'lvm'.
Signed-off-by: Sage Weil <sage@newdream.net>
Sage Weil [Thu, 5 Aug 2021 17:32:00 +0000 (13:32 -0400)]
cephadm: use generic ceph-volume activate
This allows us to activate raw or LVM osds. (In fact, LVM osds often
activate via the raw method because the LVs are already available.)
Signed-off-by: Sage Weil <sage@newdream.net>
Sage Weil [Thu, 5 Aug 2021 17:29:17 +0000 (13:29 -0400)]
ceph-volume: top-level 'activate' command
First try raw, then lvm.
Signed-off-by: Sage Weil <sage@newdream.net>
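The try-raw-then-lvm behavior can be sketched as a simple strategy fallback (hypothetical names, not the real ceph-volume code): each activation mode is attempted in order, and the first one that succeeds wins.

```python
def activate(osd_id, strategies):
    """Try each (name, fn) activation strategy in order; first success wins."""
    errors = []
    for name, fn in strategies:
        try:
            return name, fn(osd_id)
        except RuntimeError as e:
            errors.append((name, str(e)))  # remember why this mode failed
    raise RuntimeError(f"all activation strategies failed: {errors}")
```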
Sage Weil [Thu, 5 Aug 2021 17:23:27 +0000 (13:23 -0400)]
ceph-volume: lvm activate: add --no-tmpfs
This isn't necessary for cephadm, but having this arg match raw activate
makes the interface more consistent.
Signed-off-by: Sage Weil <sage@newdream.net>
Sage Weil [Thu, 5 Aug 2021 16:02:22 +0000 (12:02 -0400)]
ceph-volume: lvm activate: infer bluestore or filestore
No need to require --filestore and/or --bluestore args since we can tell
from the LV tags which one it is.
We can't drop the arguments without breaking existing users, though, so
redefine them to mean *force* bluestore or filestore activation (even
though this will error out if the tags don't match).
Signed-off-by: Sage Weil <sage@newdream.net>
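A sketch of the inference-plus-force semantics (the tag key and values here are invented for illustration, not the real LV tag names): the objectstore type is read from the LV tags, and a forced flag only errors out when it contradicts them.

```python
def infer_objectstore(lv_tags, force=None):
    """Infer bluestore/filestore from (hypothetical) LV tags; force must agree."""
    inferred = "bluestore" if lv_tags.get("ceph.type") == "block" else "filestore"
    if force and force != inferred:
        raise ValueError(f"LV tags say {inferred}, but {force} was forced")
    return force or inferred
```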
Sage Weil [Tue, 3 Aug 2021 18:34:54 +0000 (14:34 -0400)]
ceph-volume: raw activate: accept --osd-id and/or --osd-uuid instead of device
This makes it possible to start raw osds based on their uuid/id instead of
device name (which may not be stable).
Signed-off-by: Sage Weil <sage@newdream.net>
Sebastian Wagner [Mon, 1 Nov 2021 21:37:55 +0000 (22:37 +0100)]
doc/cephadm: Calling miscellaneous ceph tools
Signed-off-by: Sebastian Wagner <sewagner@redhat.com>
Yuval Lifshitz [Tue, 2 Nov 2021 12:16:44 +0000 (14:16 +0200)]
Merge pull request #43626 from curtbruns/rgw_example
rgw/lua: Example read/write of StorageClass field
Kefu Chai [Tue, 2 Nov 2021 11:32:41 +0000 (19:32 +0800)]
Merge pull request #43765 from inspur-wyq/wip-doc-4
doc/rbd/rbd-mirroring.rst: fix typos
Reviewed-by: Kefu Chai <tchaikov@gmail.com>
Sebastian Wagner [Tue, 2 Nov 2021 08:50:35 +0000 (09:50 +0100)]
Merge pull request #43628 from pcuzner/cephadm-remove-zram-devices
cephadm: exclude zram and cdrom from device list
Reviewed-by: Michael Fritch <mfritch@suse.com>
Kefu Chai [Tue, 2 Nov 2021 02:09:07 +0000 (10:09 +0800)]
Merge pull request #43766 from inspur-wyq/wip-doc2
doc/radosgw/s3-notification-compatibility.rst: fix typos
Reviewed-by: Kefu Chai <tchaikov@gmail.com>
Yin Congmin [Wed, 15 Sep 2021 11:23:43 +0000 (11:23 +0000)]
librbd/cache/pwl: fix assert in _aio_stop() during shutdown
For wait_for_ops(next_ctx), next_ctx may run in the aio thread, so the
code that follows runs on the aio thread as well. remove_pool_file()
calls bdev->close(), which calls _aio_stop() and executes
aio_thread.join(), triggering the assert: a thread can't join itself.
Fix it by adding the close ctx to m_work_queue, so close() can run in
the work queue thread.
At the same time, correct the order of wait_for_ops():
flush_dirty_entries(next_ctx) may call wake_up() and start_op(), so
moving wait_for_ops() after flush_dirty_entries(next_ctx) is more
appropriate.
Fixes: https://tracker.ceph.com/issues/52566
Signed-off-by: Yin Congmin <congmin.yin@intel.com>
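The deadlock and its fix can be modeled in a few lines of Python (an illustrative sketch, not librbd code): instead of joining the aio thread from a context that may itself be running on the aio thread, the close step is queued onto a separate work-queue thread, which can join it safely.

```python
import queue
import threading

def shutdown_via_work_queue(work_queue: queue.Queue, aio_thread: threading.Thread):
    """Hand the close step to the work-queue thread so the join is safe."""
    done = threading.Event()
    def close_ctx():
        aio_thread.join()  # safe: we are on the work-queue thread, not aio
        done.set()
    work_queue.put(close_ctx)
    return done
```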
wangyunqing [Wed, 22 Sep 2021 03:05:40 +0000 (11:05 +0800)]
doc/radosgw/s3-notification-compatibility.rst: fix typos
Signed-off-by: wangyunqing <wangyunqing@inspur.com>
Xiubo Li [Mon, 1 Nov 2021 06:08:04 +0000 (14:08 +0800)]
client: remove useless _openat()
The _openat() is never used; I believe it was introduced while coding
earlier interim patches and was simply never removed.
Signed-off-by: Xiubo Li <xiubli@redhat.com>
Xiubo Li [Mon, 1 Nov 2021 02:57:16 +0000 (10:57 +0800)]
client: remove optional for dirfd parameter
All callers of create_and_open() always pass a dirfd. I believe the
optional was introduced temporarily while coding earlier patches and
was not removed or fixed before pushing.
Signed-off-by: Xiubo Li <xiubli@redhat.com>
Neha Ojha [Tue, 2 Nov 2021 01:12:52 +0000 (18:12 -0700)]
Merge pull request #43742 from ljflores/wip-teuthology-subset
doc/dev/developer_guide/testing_integration_tests: update "frequently used options"
Reviewed-by: Neha Ojha <nojha@redhat.com>
Laura Flores [Tue, 2 Nov 2021 00:31:16 +0000 (00:31 +0000)]
doc/dev/developer_guide/testing_integration_tests: update "frequently used options"
The `subset` option is important in Teuthology runs for reducing the number of tests that are triggered. This option is outlined in another part of the Teuthology documentation, but it is important to mention it here as well.
Also, -n (for how many times the job will run) is incorrect; it should be -N.
Signed-off-by: Laura Flores <lflores@redhat.com>