git.apps.os.sepia.ceph.com Git - ceph.git/log
Radoslaw Zarzynski [Wed, 2 Jun 2021 11:59:37 +0000 (11:59 +0000)]
crimson/monc: fix subscription stall that blocked peering.
There is a scenario in which the `active_con` is properly
chosen but isn't marked as `ready_to_send`.
If `renew_subs()` is called during `on_session_opened()`,
the flag is turned on only after the subscriptions are
renewed, which cannot happen because renewal requires the flag
to be already set. In other words: there is a circular data dependency.
The net result is a stall of the subscription machinery,
particularly the `OSDMap` subs. This caused a nasty peering
issue at Sepia [1] where PG 2.7 got stuck in the `GetInfo`
state.
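The circular dependency can be illustrated with a toy model. This is only a sketch under stated assumptions: `ToyMonClient`, its `pending` queue and both `on_session_opened_*` variants are hypothetical and use plain callbacks instead of Seastar futures; only the names `ready_to_send`, `renew_subs()` and `on_session_opened()` are taken from the description. The second variant shows one possible ordering that breaks the cycle, which is not necessarily what this commit does.

```cpp
#include <functional>
#include <string>
#include <utility>
#include <vector>

// Hypothetical toy model, not the crimson monc code.
struct ToyMonClient {
  bool ready_to_send = false;
  std::vector<std::string> sent;  // messages that reached the "wire"
  // messages parked until the connection is ready, with their continuations
  std::vector<std::pair<std::string, std::function<void()>>> pending;

  // Send immediately when allowed, otherwise park the message.
  void send_message(std::string msg, std::function<void()> on_sent) {
    if (ready_to_send) {
      sent.push_back(std::move(msg));
      on_sent();
    } else {
      pending.emplace_back(std::move(msg), std::move(on_sent));
    }
  }

  void renew_subs(std::function<void()> on_renewed) {
    send_message("mon_subscribe", std::move(on_renewed));
  }

  // Problematic ordering: the flag is set only in the continuation of
  // renew_subs(), but that continuation never runs while the flag is unset.
  void on_session_opened_buggy() {
    renew_subs([this] { ready_to_send = true; });
  }

  // One possible ordering without the cycle: set the flag first.
  void on_session_opened_flag_first() {
    ready_to_send = true;
    renew_subs([] {});
  }
};
```

With the problematic ordering the subscription stays in `pending` forever; with the flag set first it is sent right away.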
```
rzarzynski@teuthology:/home/teuthworker/archive/rzarzynski-2021-05-26_12:20:26-rados-master-distro-basic-smithi/6136908$ less ./remote/smithi039/log/ceph-osd.1.log.gz
...
DEBUG 2021-05-26 20:19:48,134 [shard 0] osd - pg_epoch 14 pg[2.7( DNE empty local-lis/les=0/0 n=0 ec=0/0 lis/c=0/0 les/c/f=0/0/0 sis=0) [] r=
-1 lpr=0 crt=0'0 mlcod 0'0 unknown enter Initial
...
DEBUG 2021-05-26 20:19:48,138 [shard 0] osd - pg_epoch 14 pg[2.7( empty local-lis/les=0/0 n=0 ec=14/14 lis/c=0/0 les/c/f=0/0/0 sis=14) [1,0]
r=0 lpr=0 crt=0'0 mlcod 0'0 unknown enter Reset
...
DEBUG 2021-05-26 20:19:48,138 [shard 0] osd - pg_epoch 14 pg[2.7( empty local-lis/les=0/0 n=0 ec=14/14 lis/c=0/0 les/c/f=0/0/0 sis=14) [1,0] r=0 lpr=14 crt=0'0 mlcod 0'0 unknown enter Started
...
DEBUG 2021-05-26 20:19:48,138 [shard 0] osd - pg_epoch 14 pg[2.7( empty local-lis/les=0/0 n=0 ec=14/14 lis/c=0/0 les/c/f=0/0/0 sis=14) [1,0] r=0 lpr=14 crt=0'0 mlcod 0'0 unknown enter Start
...
DEBUG 2021-05-26 20:19:48,138 [shard 0] osd - pg_epoch 14 pg[2.7( empty local-lis/les=0/0 n=0 ec=14/14 lis/c=0/0 les/c/f=0/0/0 sis=14) [1,0] r=0 lpr=14 crt=0'0 mlcod 0'0 unknown enter Started/Primary
...
DEBUG 2021-05-26 20:19:48,138 [shard 0] osd - pg_epoch 14 pg[2.7( empty local-lis/les=0/0 n=0 ec=14/14 lis/c=0/0 les/c/f=0/0/0 sis=14) [1,0] r=0 lpr=14 crt=0'0 mlcod 0'0 creating enter Started/Primary/Peering
...
DEBUG 2021-05-26 20:19:48,138 [shard 0] osd - pg_epoch 14 pg[2.7( empty local-lis/les=0/0 n=0 ec=14/14 lis/c=0/0 les/c/f=0/0/0 sis=14) [1,0] r=0 lpr=14 crt=0'0 mlcod 0'0 creating+peering enter Started/Primary/Peering/GetInfo
DEBUG 2021-05-26 20:19:48,138 [shard 0] osd - pg_epoch 14 pg[2.7( empty local-lis/les=0/0 n=0 ec=14/14 lis/c=0/0 les/c/f=0/0/0 sis=14) [1,0] r=0 lpr=14 crt=0'0 mlcod 0'0 creating+peering build_prior all_probe
DEBUG 2021-05-26 20:19:48,139 [shard 0] osd - pg_epoch 14 pg[2.7( empty local-lis/les=0/0 n=0 ec=14/14 lis/c=0/0 les/c/f=0/0/0 sis=14) [1,0] r=0 lpr=14 crt=0'0 mlcod 0'0 creating+peering build_prior final: probe 0,1 down blocked_by {}
DEBUG 2021-05-26 20:19:48,139 [shard 0] osd - pg_epoch 14 pg[2.7( empty local-lis/les=0/0 n=0 ec=14/14 lis/c=0/0 les/c/f=0/0/0 sis=14) [1,0] r=0 lpr=14 crt=0'0 mlcod 0'0 creating+peering up_thru 0 < same_since 14, must notify monitor
DEBUG 2021-05-26 20:19:48,139 [shard 0] osd - pg_epoch 14 pg[2.7( empty local-lis/les=0/0 n=0 ec=14/14 lis/c=0/0 les/c/f=0/0/0 sis=14) [1,0] r=0 lpr=14 crt=0'0 mlcod 0'0 creating+peering state<Started/Primary/Peering/GetInfo>: no prior_set down osds, clearing prior_readable_until_ub
DEBUG 2021-05-26 20:19:48,139 [shard 0] osd - pg_epoch 14 pg[2.7( empty local-lis/les=0/0 n=0 ec=14/14 lis/c=0/0 les/c/f=0/0/0 sis=14) [1,0] r=0 lpr=14 crt=0'0 mlcod 0'0 creating+peering state<Started/Primary/Peering/GetInfo>: querying info from osd.0
...
DEBUG 2021-05-26 20:19:48,237 [shard 0] osd - pg_epoch 14 pg[2.7( empty local-lis/les=0/0 n=0 ec=14/14 lis/c=0/0 les/c/f=0/0/0 sis=14) [1,0] r=0 lpr=14 crt=0'0 mlcod 0'0 creating+peering got osd.0 2.7( DNE empty local-lis/les=0/0 n=0 ec=0/0 lis/c=0/0 les/c/f=0/0/0 sis=0)
DEBUG 2021-05-26 20:19:48,237 [shard 0] osd - pg_epoch 14 pg[2.7( empty local-lis/les=0/0 n=0 ec=14/14 lis/c=0/0 les/c/f=0/0/0 sis=14) [1,0] r=0 lpr=14 crt=0'0 mlcod 0'0 creating+peering state<Started/Primary/Peering/GetInfo>: Adding osd: 0 peer features: 3f01cfbb7ffdffff
DEBUG 2021-05-26 20:19:48,237 [shard 0] osd - pg_epoch 14 pg[2.7( empty local-lis/les=0/0 n=0 ec=14/14 lis/c=0/0 les/c/f=0/0/0 sis=14) [1,0] r=0 lpr=14 crt=0'0 mlcod 0'0 creating+peering state<Started/Primary/Peering/GetInfo>: Common peer features: 3f01cfbb7ffdffff
DEBUG 2021-05-26 20:19:48,237 [shard 0] osd - pg_epoch 14 pg[2.7( empty local-lis/les=0/0 n=0 ec=14/14 lis/c=0/0 les/c/f=0/0/0 sis=14) [1,0] r=0 lpr=14 crt=0'0 mlcod 0'0 creating+peering state<Started/Primary/Peering/GetInfo>: Common acting features: 3f01cfbb7ffdffff
DEBUG 2021-05-26 20:19:48,238 [shard 0] osd - pg_epoch 14 pg[2.7( empty local-lis/les=0/0 n=0 ec=14/14 lis/c=0/0 les/c/f=0/0/0 sis=14) [1,0] r=0 lpr=14 crt=0'0 mlcod 0'0 creating+peering state<Started/Primary/Peering/GetInfo>: Common upacting features: 3f01cfbb7ffdffff
DEBUG 2021-05-26 20:19:48,238 [shard 0] osd - pg_epoch 14 pg[2.7( empty local-lis/les=0/0 n=0 ec=14/14 lis/c=0/0 les/c/f=0/0/0 sis=14) [1,0] r=0 lpr=14 crt=0'0 mlcod 0'0 creating+peering exit Started/Primary/Peering/GetInfo 0.099480 4 2021-05-26T20:19:48.146172+0000
...
DEBUG 2021-05-26 20:19:48,238 [shard 0] osd - pg_epoch 14 pg[2.7( empty local-lis/les=0/0 n=0 ec=14/14 lis/c=0/0 les/c/f=0/0/0 sis=14) [1,0] r=0 lpr=14 crt=0'0 mlcod 0'0 creating+peering enter Started/Primary/Peering/GetLog
...
DEBUG 2021-05-26 20:19:48,238 [shard 0] osd - pg_epoch 14 pg[2.7( empty local-lis/les=0/0 n=0 ec=14/14 lis/c=0/0 les/c/f=0/0/0 sis=14) [1,0] r=0 lpr=14 crt=0'0 mlcod 0'0 creating+peering enter Started/Primary/Peering/GetMissing
...
DEBUG 2021-05-26 20:19:48,238 [shard 0] osd - pg_epoch 14 pg[2.7( empty local-lis/les=0/0 n=0 ec=14/14 lis/c=0/0 les/c/f=0/0/0 sis=14) [1,0] r=0 lpr=14 crt=0'0 mlcod 0'0 creating+peering enter Started/Primary/Peering/WaitUpThru
...
DEBUG 2021-05-26 20:19:49,139 [shard 0] osd - pg_epoch 15 pg[2.7( empty local-lis/les=0/0 n=0 ec=14/14 lis/c=0/0 les/c/f=0/0/0 sis=14) [1,0] r=0 lpr=14 crt=0'0 mlcod 0'0 creating enter Started/Primary/Active
...
DEBUG 2021-05-26 20:19:49,142 [shard 0] osd - pg_epoch 15 pg[2.7( empty local-lis/les=14/15 n=0 ec=14/14 lis/c=0/0 les/c/f=0/0/0 sis=14) [1,0] r=0 lpr=14 crt=0'0 mlcod 0'0 creating+activating enter Started/Primary/Active/Activating
...
DEBUG 2021-05-26 20:19:49,204 [shard 0] osd - pg_epoch 15 pg[2.7( empty local-lis/les=14/15 n=0 ec=14/14 lis/c=14/0 les/c/f=15/0/0 sis=14) [1,0] r=0 lpr=14 crt=0'0 mlcod 0'0 active enter Started/Primary/Active/Recovered
...
DEBUG 2021-05-26 20:19:49,204 [shard 0] osd - pg_epoch 15 pg[2.7( empty local-lis/les=14/15 n=0 ec=14/14 lis/c=14/0 les/c/f=15/0/0 sis=14) [1,0] r=0 lpr=14 crt=0'0 mlcod 0'0 active enter Started/Primary/Active/Clean
...
DEBUG 2021-05-26 20:22:31,223 [shard 0] osd - pg_epoch 86 pg[2.7( empty local-lis/les=14/15 n=0 ec=14/14 lis/c=14/14 les/c/f=15/15/0 sis=14) [1,0] r=0 lpr=14 crt=0'0 mlcod 0'0 active enter Reset
...
<a lot of flipping>
...
DEBUG 2021-05-26 20:24:07,851 [shard 0] osd - pg_epoch 163 pg[2.7( empty local-lis/les=14/15 n=0 ec=14/14 lis/c=14/14 les/c/f=15/15/0 sis=163
) [1,0] r=0 lpr=163 pi=[14,163)/1 crt=0'0 mlcod 0'0 unknown activate_map
DEBUG 2021-05-26 20:24:07,851 [shard 0] osd - pg_epoch 163 pg[2.7( empty local-lis/les=14/15 n=0 ec=14/14 lis/c=14/14 les/c/f=15/15/0 sis=163
) [1,0] r=0 lpr=163 pi=[14,163)/1 crt=0'0 mlcod 0'0 unknown exit Reset 0.035744 1 2021-05-26T20:24:07.817331+0000
INFO 2021-05-26 20:24:07,851 [shard 0] osd - Exiting state: Reset, entered at 1622060647.8158188, 1622060647.8173316 spent on 1 events
DEBUG 2021-05-26 20:24:07,851 [shard 0] osd - pg_epoch 163 pg[2.7( empty local-lis/les=14/15 n=0 ec=14/14 lis/c=14/14 les/c/f=15/15/0 sis=163
) [1,0] r=0 lpr=163 pi=[14,163)/1 crt=0'0 mlcod 0'0 unknown enter Started
INFO 2021-05-26 20:24:07,851 [shard 0] osd - Entering state: Started
DEBUG 2021-05-26 20:24:07,851 [shard 0] osd - pg_epoch 163 pg[2.7( empty local-lis/les=14/15 n=0 ec=14/14 lis/c=14/14 les/c/f=15/15/0 sis=163
) [1,0] r=0 lpr=163 pi=[14,163)/1 crt=0'0 mlcod 0'0 unknown enter Start
INFO 2021-05-26 20:24:07,851 [shard 0] osd - Entering state: Start
INFO 2021-05-26 20:24:07,851 [shard 0] osd - pg_epoch 163 pg[2.7( empty local-lis/les=14/15 n=0 ec=14/14 lis/c=14/14 les/c/f=15/15/0 sis=163
) [1,0] r=0 lpr=163 pi=[14,163)/1 crt=0'0 mlcod 0'0 unknown state<Start>: transitioning to Primary
DEBUG 2021-05-26 20:24:07,851 [shard 0] osd - pg_epoch 163 pg[2.7( empty local-lis/les=14/15 n=0 ec=14/14 lis/c=14/14 les/c/f=15/15/0 sis=163
) [1,0] r=0 lpr=163 pi=[14,163)/1 crt=0'0 mlcod 0'0 unknown exit Start 0.000041 0 0.000000
INFO 2021-05-26 20:24:07,851 [shard 0] osd - Exiting state: Start, entered at 1622060647.8516333, 0.0 spent on 0 events
DEBUG 2021-05-26 20:24:07,852 [shard 0] osd - pg_epoch 163 pg[2.7( empty local-lis/les=14/15 n=0 ec=14/14 lis/c=14/14 les/c/f=15/15/0 sis=163
) [1,0] r=0 lpr=163 pi=[14,163)/1 crt=0'0 mlcod 0'0 unknown enter Started/Primary
INFO 2021-05-26 20:24:07,852 [shard 0] osd - Entering state: Started/Primary
DEBUG 2021-05-26 20:24:07,852 [shard 0] osd - pg_epoch 163 pg[2.7( empty local-lis/les=14/15 n=0 ec=14/14 lis/c=14/14 les/c/f=15/15/0 sis=163
) [1,0] r=0 lpr=163 pi=[14,163)/1 crt=0'0 mlcod 0'0 unknown enter Started/Primary/Peering
INFO 2021-05-26 20:24:07,852 [shard 0] osd - Entering state: Started/Primary/Peering
DEBUG 2021-05-26 20:24:07,852 [shard 0] osd - pg_epoch 163 pg[2.7( empty local-lis/les=14/15 n=0 ec=14/14 lis/c=14/14 les/c/f=15/15/0 sis=163) [1,0] r=0 lpr=163 pi=[14,163)/1 crt=0'0 mlcod 0'0 peering enter Started/Primary/Peering/GetInfo
INFO 2021-05-26 20:24:07,852 [shard 0] osd - Entering state: Started/Primary/Peering/GetInfo
...
DEBUG 2021-05-26 20:24:07,852 [shard 0] osd - pg_epoch 163 pg[2.7( empty local-lis/les=14/15 n=0 ec=14/14 lis/c=14/14 les/c/f=15/15/0 sis=163) [1,0] r=0 lpr=163 pi=[14,163)/1 crt=0'0 mlcod 0'0 peering build_prior all_probe 0,1,4
DEBUG 2021-05-26 20:24:07,852 [shard 0] osd - pg_epoch 163 pg[2.7( empty local-lis/les=14/15 n=0 ec=14/14 lis/c=14/14 les/c/f=15/15/0 sis=163) [1,0] r=0 lpr=163 pi=[14,163)/1 crt=0'0 mlcod 0'0 peering build_prior maybe_rw interval:139, acting: 0
DEBUG 2021-05-26 20:24:07,852 [shard 0] osd - pg_epoch 163 pg[2.7( empty local-lis/les=14/15 n=0 ec=14/14 lis/c=14/14 les/c/f=15/15/0 sis=163) [1,0] r=0 lpr=163 pi=[14,163)/1 crt=0'0 mlcod 0'0 peering build_prior final: probe 0,1,4 down blocked_by {}
DEBUG 2021-05-26 20:24:07,852 [shard 0] osd - pg_epoch 163 pg[2.7( empty local-lis/les=14/15 n=0 ec=14/14 lis/c=14/14 les/c/f=15/15/0 sis=163) [1,0] r=0 lpr=163 pi=[14,163)/1 crt=0'0 mlcod 0'0 peering up_thru 125 < same_since 163, must notify monitor
DEBUG 2021-05-26 20:24:07,852 [shard 0] osd - pg_epoch 163 pg[2.7( empty local-lis/les=14/15 n=0 ec=14/14 lis/c=14/14 les/c/f=15/15/0 sis=163) [1,0] r=0 lpr=163 pi=[14,163)/1 crt=0'0 mlcod 0'0 peering state<Started/Primary/Peering/GetInfo>: no prior_set down osds, clearing prior_readable_until_ub
DEBUG 2021-05-26 20:24:07,852 [shard 0] osd - pg_epoch 163 pg[2.7( empty local-lis/les=14/15 n=0 ec=14/14 lis/c=14/14 les/c/f=15/15/0 sis=163) [1,0] r=0 lpr=163 pi=[14,163)/1 crt=0'0 mlcod 0'0 peering state<Started/Primary/Peering/GetInfo>: querying info from osd.0
DEBUG 2021-05-26 20:24:07,852 [shard 0] osd - pg_epoch 163 pg[2.7( empty local-lis/les=14/15 n=0 ec=14/14 lis/c=14/14 les/c/f=15/15/0 sis=163) [1,0] r=0 lpr=163 pi=[14,163)/1 crt=0'0 mlcod 0'0 peering state<Started/Primary/Peering/GetInfo>: querying info from osd.4
...
DEBUG 2021-05-26 20:24:07,924 [shard 0] ms - [osd.1(cluster) v2:172.21.15.39:6803/34727@61064 >> osd.4 v2:172.21.15.62:6802/34686] connect to existing
DEBUG 2021-05-26 20:24:07,924 [shard 0] ms - [osd.1(cluster) v2:172.21.15.39:6803/34727@61064 >> osd.4 v2:172.21.15.62:6802/34686] --> #62 === pg_query2(2.7 2.7 query(info 0'0 epoch_sent 163) e163/163) v1 (131)
...
DEBUG 2021-05-26 20:24:07,942 [shard 0] ms - [osd.1(cluster) v2:172.21.15.39:6803/34727@61064 >> osd.4 v2:172.21.15.62:6802/34686] GOT AckFrame: seq=62
...
<plenty of osd_ping messaging but no reply to the pg_query for 2.7>
...
DEBUG 2021-05-26 20:58:19,829 [shard 0] ms - [osd.1(hb_front) v2:172.21.15.39:6807/34727 >> osd.4 v2:172.21.15.62:6807/34686@54816] <== #772 =
== osd_ping(ping e17 up_from 10 ping_stamp 2021-05-26T20:58:19.825573+0000/2319.780029297s send_stamp 2319.780029297s) v5 (70)
DEBUG 2021-05-26 20:58:19,829 [shard 0] ms - [osd.1(hb_front) v2:172.21.15.39:6807/34727 >> osd.4 v2:172.21.15.62:6807/34686@54816] --> #772 === osd_ping(ping_reply e249 up_from 10 ping_stamp 2021-05-26T20:58:19.825573+0000/2319.780029297s send_stamp 2320.039062500s) v5 (70
```
The peering request got stuck waiting for an `OSDMap`.
```
DEBUG 2021-05-26 20:24:07,930 [shard 0] ms - [osd.4(cluster) v2:172.21.15.62:6802/34686 >> osd.1 v2:172.21.15.39:6803/34727@61064] <== #62 === pg_query2(2.7 2.7 query(info 0'0 epoch_sent 163) e163/163) v1 (131)
DEBUG 2021-05-26 20:24:07,930 [shard 0] osd - handle_peering_op on 2.7 from 1
DEBUG 2021-05-26 20:24:07,930 [shard 0] osd - peering_event(id=517, detail=PeeringEvent(from=1 pgid=2.7 sent=163 requested=163 evt=epoch_sent: 163 epoch_requested: 163 MQuery 2.7 from 1 query_epoch 163 query: query(info 0'0 epoch_sent 163))): star
```
```
INFO 2021-05-26 20:19:49,127 [shard 0] osd - evt epoch is 15, i have 14, will wait
INFO 2021-05-26 20:19:49,128 [shard 0] osd - osdmap_subscribe(14)
DEBUG 2021-05-26 20:19:49,128 [shard 0] ms - [osd.4(client) v2:172.21.15.62:6801/34686@63208 >> mon.1 v2:172.21.15.62:3300/0] --> #9 === mon_s
ubscribe({osdmap=14}) v3 (15)
...
INFO 2021-05-26 20:19:49,131 [shard 0] osd - handle_osd_map osd_map(14..15 src has 1..15) v4
INFO 2021-05-26 20:19:49,131 [shard 0] osd - handle_osd_map epochs [14..15], i have 15, src has [1..15]
...
INFO 2021-05-26 20:19:49,138 [shard 0] osd - handle_osd_map osd_map(14..15 src has 1..15) v4
INFO 2021-05-26 20:19:49,138 [shard 0] osd - handle_osd_map epochs [14..15], i have 15, src has [1..15]
...
INFO 2021-05-26 20:19:49,139 [shard 0] osd - evt epoch is 15, i have 14, will wait
INFO 2021-05-26 20:19:49,141 [shard 0] osd - osdmap_subscribe(14)
WARN 2021-05-26 20:19:49,141 [shard 0] monc - renew_subs - empty
...
INFO 2021-05-26 20:19:50,140 [shard 0] osd - handle_osd_map osd_map(15..16 src has 1..16) v4
INFO 2021-05-26 20:19:50,140 [shard 0] osd - handle_osd_map epochs [15..16], i have 15, src has [1..16]
DEBUG 2021-05-26 20:19:50,141 [shard 0] bluestore - do_transaction
INFO 2021-05-26 20:19:50,145 [shard 0] osd - osd.4: committed_osd_maps(16, 16)
...
INFO 2021-05-26 20:20:42,881 [shard 0] osd - handle_osd_map epochs [16..17], i have 16, src has [1..17]
DEBUG 2021-05-26 20:20:42,882 [shard 0] bluestore - do_transaction
INFO 2021-05-26 20:20:42,886 [shard 0] osd - osd.4: committed_osd_maps(17, 17)
...
INFO 2021-05-26 20:20:43,941 [shard 0] osd - evt epoch is 18, i have 17, will wait
INFO 2021-05-26 20:20:43,941 [shard 0] osd - osdmap_subscribe(17)
...
INFO 2021-05-26 20:20:43,957 [shard 0] osd - evt epoch is 18, i have 17, will wait
INFO 2021-05-26 20:20:43,957 [shard 0] osd - osdmap_subscribe(17)
...
INFO 2021-05-26 20:20:43,969 [shard 0] osd - evt epoch is 18, i have 17, will wait
INFO 2021-05-26 20:20:43,969 [shard 0] osd - osdmap_subscribe(17)
...
DEBUG 2021-05-26 20:20:46,930 [shard 0] ms - [osd.4(client) v2:172.21.15.62:6801/34686@57288 >> mon.2 v2:172.21.15.39:3301/0] <== #4 === osd_m
ap(20..21 src has 1..21) v4 (41)
INFO 2021-05-26 20:20:46,930 [shard 0] osd - handle_osd_map osd_map(20..21 src has 1..21) v4
INFO 2021-05-26 20:20:46,930 [shard 0] osd - handle_osd_map epochs [20..21], i have 17, src has [1..21]
INFO 2021-05-26 20:20:46,930 [shard 0] osd - handle_osd_map message skips epochs 18..19
INFO 2021-05-26 20:20:46,930 [shard 0] osd - osdmap_subscribe(18)
...
DEBUG 2021-05-26 20:20:47,936 [shard 0] ms - [osd.4(client) v2:172.21.15.62:6801/34686@57288 >> mon.2 v2:172.21.15.39:3301/0] <== #5 === osd_m
ap(21..22 src has 1..22) v4 (41)
INFO 2021-05-26 20:20:47,936 [shard 0] osd - handle_osd_map osd_map(21..22 src has 1..22) v4
INFO 2021-05-26 20:20:47,936 [shard 0] osd - handle_osd_map epochs [21..22], i have 17, src has [1..22]
INFO 2021-05-26 20:20:47,936 [shard 0] osd - handle_osd_map message skips epochs 18..20
INFO 2021-05-26 20:20:47,936 [shard 0] osd - osdmap_subscribe(18)
...
<osdmap_subscribe(18) over and over>
```
```
2021-05-26T20:19:42.048+0000 7f4712ffd700 1 -- [v2:172.21.15.62:3300/0,v1:172.21.15.62:6789/0] <== osd.4 v2:172.21.15.62:6801/34686 4 ==== mon_subscribe({mgrmap=0+,osd_pg_creates=0+,osdmap=0+}) v3 ==== 82+0+0 (secure 0 0 0) 0x7f46fc04e150 con 0x7f470401c480
2021-05-26T20:19:42.048+0000 7f4712ffd700 20 mon.b@1(peon) e1 _ms_dispatch existing session 0x7f46fc02f500 for osd.4
2021-05-26T20:19:42.048+0000 7f4712ffd700 20 mon.b@1(peon) e1 entity_name osd.4 global_id 4168 (new_ok) caps allow *
2021-05-26T20:19:42.048+0000 7f4712ffd700 10 mon.b@1(peon) e1 handle_subscribe mon_subscribe({mgrmap=0+,osd_pg_creates=0+,osdmap=0+}) v3
...
2021-05-26T20:19:49.129+0000 7f4712ffd700 1 -- [v2:172.21.15.62:3300/0,v1:172.21.15.62:6789/0] <== osd.4 v2:172.21.15.62:6801/34686 9 ==== mon_subscribe({osdmap=14}) v3 ==== 36+0+0 (secure 0 0 0) 0x7f46e8556210 con 0x7f470401c480
2021-05-26T20:19:49.129+0000 7f4712ffd700 20 mon.b@1(peon) e1 _ms_dispatch existing session 0x7f46fc02f500 for osd.4
2021-05-26T20:19:49.129+0000 7f4712ffd700 20 mon.b@1(peon) e1 entity_name osd.4 global_id 4168 (new_ok) caps allow *
2021-05-26T20:19:49.129+0000 7f4712ffd700 10 mon.b@1(peon) e1 handle_subscribe mon_subscribe({osdmap=14}) v3
2021-05-26T20:19:49.129+0000 7f4712ffd700 20 is_capable service=mon command= read addr v2:172.21.15.62:6801/34686 on cap allow *
2021-05-26T20:19:49.129+0000 7f4712ffd700 20 allow so far , doing grant allow *
2021-05-26T20:19:49.129+0000 7f4712ffd700 20 allow all
2021-05-26T20:19:49.129+0000 7f4712ffd700 20 is_capable service=osd command= read addr v2:172.21.15.62:6801/34686 on cap allow *
2021-05-26T20:19:49.129+0000 7f4712ffd700 20 allow so far , doing grant allow *
2021-05-26T20:19:49.129+0000 7f4712ffd700 20 allow all
2021-05-26T20:19:49.129+0000 7f4712ffd700 10 mon.b@1(peon).osd e15 check_osdmap_sub 0x7f46e84f0150 next 14 (onetime)
2021-05-26T20:19:49.129+0000 7f4712ffd700 5 mon.b@1(peon).osd e15 send_incremental [14..15] to osd.4
2021-05-26T20:19:49.129+0000 7f4712ffd700 10 mon.b@1(peon).osd e15 build_incremental [14..15] with features 3f01cfbb7ffdffff
2021-05-26T20:19:49.129+0000 7f4712ffd700 20 mon.b@1(peon).osd e15 build_incremental inc 15 622 bytes
2021-05-26T20:19:49.129+0000 7f4712ffd700 20 mon.b@1(peon).osd e15 build_incremental inc 14 578 bytes
2021-05-26T20:19:49.129+0000 7f4712ffd700 1 -- [v2:172.21.15.62:3300/0,v1:172.21.15.62:6789/0] --> v2:172.21.15.62:6801/34686 -- osd_map(14..15 src has 1..15) v4 -- 0x7f46e856a100 con 0x7f470401c480
```
```
seastar::future<> Client::renew_subs()
{
  if (!sub.have_new()) {
    logger().warn("{} - empty", __func__);
    return seastar::now();
  }
  logger().trace("{}", __func__);
  auto m = crimson::make_message<MMonSubscribe>();
  m->what = sub.get_subs();
  m->hostname = ceph_get_short_hostname();
  return send_message(std::move(m)).then([this] {
    sub.renewed();
  });
}
```
```
INFO 2021-05-26 20:19:42,081 [shard 0] osd - osdmap_subscribe(1)
DEBUG 2021-05-26 20:19:42,081 [shard 0] ms - [osd.4(client) v2:172.21.15.62:6801/34686@63208 >> mon.1 v2:172.21.15.62:3300/0] --> #6 === mon_s
ubscribe({osdmap=1}) v3 (15)
...
INFO 2021-05-26 20:19:49,128 [shard 0] osd - osdmap_subscribe(14)
DEBUG 2021-05-26 20:19:49,128 [shard 0] ms - [osd.4(client) v2:172.21.15.62:6801/34686@63208 >> mon.1 v2:172.21.15.62:3300/0] --> #9 === mon_subscribe({osdmap=14}) v3 (15)
...
INFO 2021-05-26 20:19:49,141 [shard 0] osd - osdmap_subscribe(14)
WARN 2021-05-26 20:19:49,141 [shard 0] monc - renew_subs - empty
<no MMonSubscribe>
...
INFO 2021-05-26 20:20:43,941 [shard 0] osd - evt epoch is 18, i have 17, will wait
INFO 2021-05-26 20:20:43,941 [shard 0] osd - osdmap_subscribe(17)
<no MMonSubscribe>
...
INFO 2021-05-26 20:20:46,930 [shard 0] osd - handle_osd_map message skips epochs 18..19
INFO 2021-05-26 20:20:46,930 [shard 0] osd - osdmap_subscribe(18)
<no MMonSubscribe>
```
[1]: http://pulpito.front.sepia.ceph.com/rzarzynski-2021-05-26_12:20:26-rados-master-distro-basic-smithi/6136908
Signed-off-by: Radoslaw Zarzynski <rzarzyns@redhat.com>
Radoslaw Zarzynski [Mon, 31 May 2021 23:37:04 +0000 (23:37 +0000)]
crimson/os: fix formatting in AlienStore::get_attr().
Signed-off-by: Radoslaw Zarzynski <rzarzyns@redhat.com>
Radoslaw Zarzynski [Mon, 31 May 2021 22:05:25 +0000 (22:05 +0000)]
crimson/os: fix use-after-free in AlienStore::get_attr().
The `FuturizedStore` interface requires that `get_attr()`
take the `name` parameter as a `std::string_view`, and
thus burdens implementations with extending the lifetime
of the data the instance refers to.
Unfortunately, `AlienStore` is unaware that prolonging
the life of a `std::string_view` instance doesn't prolong
the life of the memory it points to. This problem has manifested
in the following use-after-free detected at Sepia:
```
rzarzynski@teuthology:/home/teuthworker/archive/rzarzynski-2021-05-26_12:20:26-rados-master-distro-basic-smithi/6136929$ less ./remote/smithi194/log/ceph-osd.7.log.gz
...
DEBUG 2021-05-26 20:24:54,077 [shard 0] osd - do_osd_ops_execute: object 14:55e1a5b4:test-rados-api-smithi067-38889-2::foo:head - handling op call
DEBUG 2021-05-26 20:24:54,077 [shard 0] osd - handling op call on object 14:55e1a5b4:test-rados-api-smithi067-38889-2::foo:head
DEBUG 2021-05-26 20:24:54,078 [shard 0] osd - calling method lock.lock, num_read=0, num_write=0
DEBUG 2021-05-26 20:24:54,078 [shard 0] osd - handling op getxattr on object 14:55e1a5b4:test-rados-api-smithi067-38889-2::foo:head
DEBUG 2021-05-26 20:24:54,078 [shard 0] osd - getxattr on obj=14:55e1a5b4:test-rados-api-smithi067-38889-2::foo:head for attr=_lock.TestLockPP1
DEBUG 2021-05-26 20:24:54,078 [shard 0] bluestore - get_attr
=================================================================
==34068==ERROR: AddressSanitizer: heap-use-after-free on address 0x6030001851d0 at pc 0x7f824d6a5b27 bp 0x7f822b4201c0 sp 0x7f822b41f968
READ of size 17 at 0x6030001851d0 thread T28 (alien-store-tp)
...
#0 0x7f824d6a5b26 (/lib64/libasan.so.5+0x40b26)
#1 0x55e2cbb2e00b (/usr/bin/ceph-osd+0x2b6dc00b)
#2 0x55e2d31f086e (/usr/bin/ceph-osd+0x32d9e86e)
#3 0x55e2d3467607 in crimson::os::ThreadPool::loop(std::chrono::duration<long, std::ratio<1l, 1000l> >, unsigned long) (/usr/bin/ceph-osd+0x33015607)
#4 0x55e2d346b14a (/usr/bin/ceph-osd+0x3301914a)
#5 0x7f8249d32ba2 (/lib64/libstdc++.so.6+0xc2ba2)
#6 0x7f824a00d149 in start_thread (/lib64/libpthread.so.0+0x8149)
#7 0x7f82486edf22 in clone (/lib64/libc.so.6+0xfcf22)
0x6030001851d0 is located 0 bytes inside of 31-byte region [0x6030001851d0,0x6030001851ef)
freed by thread T0 here:
#0 0x7f824d757688 in operator delete(void*) (/lib64/libasan.so.5+0xf2688)
previously allocated by thread T0 here:
#0 0x7f824d7567b0 in operator new(unsigned long) (/lib64/libasan.so.5+0xf17b0)
Thread T28 (alien-store-tp) created by T0 here:
#0 0x7f824d6b7ea3 in __interceptor_pthread_create (/lib64/libasan.so.5+0x52ea3)
SUMMARY: AddressSanitizer: heap-use-after-free (/lib64/libasan.so.5+0x40b26)
Shadow bytes around the buggy address:
0x0c06800289e0: fd fd fd fa fa fa fd fd fd fa fa fa 00 00 00 fa
0x0c06800289f0: fa fa fd fd fd fa fa fa fd fd fd fa fa fa fd fd
0x0c0680028a00: fd fa fa fa fd fd fd fa fa fa fd fd fd fa fa fa
0x0c0680028a10: fd fd fd fa fa fa fd fd fd fa fa fa fd fd fd fa
0x0c0680028a20: fa fa fd fd fd fa fa fa fd fd fd fa fa fa fd fd
=>0x0c0680028a30: fd fd fa fa fd fd fd fd fa fa[fd]fd fd fd fa fa
0x0c0680028a40: fd fd fd fd fa fa fd fd fd fd fa fa 00 00 00 07
0x0c0680028a50: fa fa 00 00 00 fa fa fa 00 00 00 fa fa fa fd fd
0x0c0680028a60: fd fd fa fa fd fd fd fd fa fa fd fd fd fd fa fa
0x0c0680028a70: 00 00 00 00 fa fa fd fd fd fd fa fa fd fd fd fd
0x0c0680028a80: fa fa fd fd fd fd fa fa fd fd fd fd fa fa fd fd
Shadow byte legend (one shadow byte represents 8 application bytes):
Addressable: 00
Partially addressable: 01 02 03 04 05 06 07
Heap left redzone: fa
Freed heap region: fd
Stack left redzone: f1
Stack mid redzone: f2
Stack right redzone: f3
Stack after return: f5
Stack use after scope: f8
Global redzone: f9
Global init order: f6
Poisoned by user: f7
Container overflow: fc
Array cookie: ac
Intra object redzone: bb
ASan internal: fe
Left alloca redzone: ca
Right alloca redzone: cb
==34068==ABORTING
```
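The lifetime issue behind this report can be sketched in isolation. The example below is a minimal illustration, not the `AlienStore` code: `make_deferred_lookup` is a hypothetical helper that returns work to be run later (for instance on another thread). Capturing the `std::string_view` itself would copy only the pointer and length; copying the characters into an owning `std::string` keeps them alive for as long as the closure exists.

```cpp
#include <functional>
#include <string>
#include <string_view>

// Hypothetical helper: builds a task that is executed some time later.
std::function<std::string()> make_deferred_lookup(std::string_view name) {
  // Unsafe alternative (deliberately not used): `[name] { ... }` would keep
  // only a view, which dangles once the caller's buffer is freed.
  // Safe: the closure owns its own copy of the characters.
  return [owned = std::string{name}] { return "attr:" + owned; };
}
```

The returned task can outlive the string that was passed in, because it no longer refers to the caller's memory.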
Signed-off-by: Radoslaw Zarzynski <rzarzyns@redhat.com>
Mykola Golub [Mon, 31 May 2021 16:34:53 +0000 (19:34 +0300)]
Merge pull request #41514 from ideepika/wip-49592-upgrade
qa/upgrade: conditionally disable update_features tests
Reviewed-by: Kefu Chai <kchai@redhat.com>
Reviewed-by: Mykola Golub <mgolub@suse.com>
Kefu Chai [Mon, 31 May 2021 12:07:33 +0000 (20:07 +0800)]
Merge pull request #41589 from tchaikov/wip-crimson-start-up-error
crimson: handle startup failures properly
Reviewed-by: Radoslaw Zarzynski <rzarzyns@redhat.com>
Kefu Chai [Sat, 29 May 2021 08:24:59 +0000 (16:24 +0800)]
crimson/os/alienstore: do not cleanup if not started
there is a chance that the stop() and umount() methods get called even if
start() is not called in the error handling path. in that case, just make
these methods no-ops, to ensure that the OSD behaves in that case.
Signed-off-by: Kefu Chai <kchai@redhat.com>
Kefu Chai [Sat, 29 May 2021 08:03:50 +0000 (16:03 +0800)]
crimson/os/alienstore: create tp in AlienStore::start()
thread pool is not needed until AlienStore::start(). with this change,
we are able to tell if the AlienStore is actually started or not in
AlienStore::stop().
as seastar::sharded<Service> starts a service in two phases:
1. construct the shard instances
2. actually start them
and it stops a service in a single shot, which both stops the services
and destructs the service instance(s).
so we have to implement a proper stop() method for services whose
start() might not be called after its instance is created by
seastar::sharded<Service>::start() in case of error handling or if
we just don't want to call start().
to ensure we can skip the steps to clean up the stuff created by
start(), we need to have a flag in the sharded service, because
AlienStore is a member variable of OSD, and when we do mkfs, AlienStore
is not start()'ed, and as explained above, we have to call OSD::stop()
to ensure OSD instance is destructed properly. but OSD::stop()
calls store->umount() and store->stop() unconditionally. these methods
in AlienStore rely on a functional thread pool.
fortunately, we don't need to call these methods if the store is never
mounted or started. in a case of failed "mkfs", store is not mounted at
all but the store and osd instances are created.
so, in this change, thread pool is created in AlienStore::start(), and
we will use it to tell if AlienStore is started or not in the following
change which makes the related methods no-ops if AlienStore is not started
yet.
also, postpone the creation of `store` until AlienStore::start(), so
we don't need to destroy it in the dtor of AlienStore. otherwise,
BlueStore::~BlueStore() would need to reference resources which are only
available in alien threads, but when OSD::~OSD() is called, we are in
seastar's reactor.
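a minimal sketch of the idea, assuming nothing about the real classes: `ToyStore` and `ToyThreadPool` are hypothetical stand-ins, and the log vector only exists to make the behaviour observable. the thread pool pointer is created in start() and doubles as the "was start() called?" flag, so stop() can skip the cleanup when start() never ran.

```cpp
#include <memory>
#include <string>
#include <vector>

// Hypothetical stand-in for a thread pool; records what happens to it.
struct ToyThreadPool {
  std::vector<std::string>& log;
  explicit ToyThreadPool(std::vector<std::string>& l) : log(l) {
    log.push_back("tp created");
  }
  void stop() { log.push_back("tp stopped"); }
};

// Hypothetical stand-in for a store that is constructed first and
// started later (or never, e.g. on an error path).
class ToyStore {
  std::vector<std::string>& log;
  std::unique_ptr<ToyThreadPool> tp;  // stays null until start() runs
public:
  explicit ToyStore(std::vector<std::string>& l) : log(l) {}

  void start() { tp = std::make_unique<ToyThreadPool>(log); }

  void stop() {
    if (!tp) {
      // start() was never called: nothing to clean up
      log.push_back("stop skipped");
      return;
    }
    tp->stop();
    tp.reset();
  }
};
```

calling stop() on an instance that was only constructed is then harmless, while a started instance still gets its thread pool stopped.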
Signed-off-by: Kefu Chai <kchai@redhat.com>
Kefu Chai [Sat, 29 May 2021 07:08:18 +0000 (15:08 +0800)]
crimson/osd/main: always stop osd as long as it started
otherwise the sharded_service's dtor complains if we destruct it without
stopping it first, like:
FATAL: startup failed: std::system_error (error crimson::net:3, negotiation failure)
crimson-osd: ../src/seastar/include/seastar/core/sharded.hh:523: seastar::sharded<T>::~sharded() [with Service = crimson::osd::OSD]: Assertion `_instances.empty()' failed.
Aborting on shard 0.
Signed-off-by: Kefu Chai <kchai@redhat.com>
Kefu Chai [Sat, 29 May 2021 07:03:01 +0000 (15:03 +0800)]
crimson/osd/main: do cleanup using defer()
since we do the startup in a seastar thread, we have the luxury of doing
cleanup using the RAII machinery.
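a rough sketch of the RAII pattern, under stated assumptions: the hand-written `ScopeGuard` below is a hypothetical stand-in for defer(), and `toy_startup` with its log strings is invented for illustration. guards run in reverse order of creation when the scope is left, whether by return or by exception.

```cpp
#include <functional>
#include <stdexcept>
#include <string>
#include <utility>
#include <vector>

// Hypothetical scope guard: runs the stored callable on destruction.
class ScopeGuard {
  std::function<void()> fn;
public:
  explicit ScopeGuard(std::function<void()> f) : fn(std::move(f)) {}
  ScopeGuard(const ScopeGuard&) = delete;
  ScopeGuard& operator=(const ScopeGuard&) = delete;
  ~ScopeGuard() { fn(); }
};

// Invented startup sequence; each started "service" registers its cleanup
// right after it has been started.
bool toy_startup(bool fail_midway, std::vector<std::string>& log) {
  try {
    log.push_back("start messenger");
    ScopeGuard stop_msgr([&log] { log.push_back("stop messenger"); });
    log.push_back("start osd");
    ScopeGuard stop_osd([&log] { log.push_back("stop osd"); });
    if (fail_midway) {
      throw std::runtime_error("negotiation failure");
    }
    log.push_back("running");
    return true;
  } catch (const std::exception&) {
    // both guards have already run during stack unwinding
    log.push_back("startup failed");
    return false;
  }
}
```

the cleanup code is written once, next to the corresponding startup step, and needs no separate error branch.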
Signed-off-by: Kefu Chai <kchai@redhat.com>
Kefu Chai [Sat, 29 May 2021 06:51:09 +0000 (14:51 +0800)]
crimson/osd/main: catch exception thrown in the async() call
* use seastar::app_template::run() instead of
seastar::app_template::run_deprecated() for returning int,
instead of returning `void`. so the application can return
int explicitly in the continuation passed to run(). more
readable this way.
* wrap the whole block in run() in a giant try-catch block,
so the exceptions thrown by the startup code can be captured
and handled.
* do not capture the exceptions individually, in the try-catch
block anymore. the outer catch block takes care of them.
this change improves the error handling when crimson-osd launches.
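the shape of the outer try-catch can be sketched without seastar. `run_guarded` is a hypothetical helper, not the code in main: it runs a startup callable, converts any exception into a non-zero return value, and prints a message in the style of the "FATAL: startup failed" line quoted in this series.

```cpp
#include <cstdlib>
#include <exception>
#include <functional>
#include <iostream>
#include <stdexcept>

// Hypothetical wrapper: one outer handler instead of many inner ones.
int run_guarded(const std::function<void()>& startup) {
  try {
    startup();
    return EXIT_SUCCESS;
  } catch (const std::exception& e) {
    std::cerr << "FATAL: startup failed: " << e.what() << '\n';
    return EXIT_FAILURE;
  } catch (...) {
    std::cerr << "FATAL: startup failed: unknown exception\n";
    return EXIT_FAILURE;
  }
}
```

returning the exit code explicitly keeps the success and failure paths visible in one place.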
Signed-off-by: Kefu Chai <kchai@redhat.com>
Deepika [Mon, 24 May 2021 21:20:39 +0000 (21:20 +0000)]
qa/upgrade: conditionally disable update_features tests
with the recent support for async rbd operations in pacific+, a hang can
occur when an older client (without async support) is being upgraded and
simultaneously interacts with a newer client which expects the requests
to be async: the return code for request completion is taken as the
acknowledgement of the async request, and the client then keeps waiting
for another acknowledgement of request completion.
this should be rare, happening only when the lock owner is an old client,
and should be deferred if compatibility issues arise.
see also:
541230475d3b25ab18c4eb9bc5011060462594a6 (octopus)
Signed-off-by: Deepika <dupadhya@redhat.com>
Kefu Chai [Mon, 31 May 2021 01:40:50 +0000 (09:40 +0800)]
Merge pull request #41552 from tchaikov/wip-mgr-find-roots
mgr: expose CRUSHMap.find_roots()
Reviewed-by: Avan Thakkar <athakkar@redhat.com>
J. Eric Ivancich [Sat, 29 May 2021 16:18:45 +0000 (12:18 -0400)]
Merge pull request #41563 from cybozu/rgw-add-the-description-of-blocking-io-during-index-resharding
rgw: add the description of blocking io during index resharding
Reviewed-by: Matt Benjamin <mbenjamin@redhat.com>
Reviewed-by: J. Eric Ivancich <ivancich@redhat.com>
Kefu Chai [Sat, 29 May 2021 06:48:11 +0000 (14:48 +0800)]
crimson/osd/main: handle and rethrow exception in fetch_config()
print a more verbose error message when monc fails to connect to the
monitor, for a better user experience.
also, unregister all dispatchers by calling msgr->stop() before calling
monc.stop() to ensure the messenger can be shut down gracefully.
Signed-off-by: Kefu Chai <kchai@redhat.com>
Kefu Chai [Sat, 29 May 2021 05:45:41 +0000 (13:45 +0800)]
test/crimson/test_messenger: add editor variables in header
to help emacs and vim to format the code better.
Signed-off-by: Kefu Chai <kchai@redhat.com>
Kefu Chai [Sat, 29 May 2021 05:44:29 +0000 (13:44 +0800)]
crimson/osd/main: do cleanup using defer() in fetch_config()
so we can stop the started services even if some of the step(s) throw or
fail.
Signed-off-by: Kefu Chai <kchai@redhat.com>
Kefu Chai [Sat, 29 May 2021 03:52:45 +0000 (11:52 +0800)]
vstart.sh: remove unused variable
osdmap_fn is not used after being initialized, so drop it.
Signed-off-by: Kefu Chai <kchai@redhat.com>
Kefu Chai [Sat, 29 May 2021 02:42:14 +0000 (10:42 +0800)]
Merge pull request #41278 from sebastian-philipp/mgr-cephadm-set-user-no-hosts
mgr/cephadm: Don't call _check_host without hosts
Reviewed-by: Juan Miguel Olmo <jolmomar@redhat.com>
Reviewed-by: Adam King <adking@redhat.com>
Kefu Chai [Sat, 29 May 2021 02:37:31 +0000 (10:37 +0800)]
Merge pull request #41520 from tchaikov/wip-osd-unique-ptr
os: let ObjectStore::create() return unique_ptr<>
Reviewed-by: Radoslaw Zarzynski <rzarzyns@redhat.com>
Kefu Chai [Sat, 29 May 2021 02:36:43 +0000 (10:36 +0800)]
Merge pull request #41573 from tchaikov/wip-allocat-ctor
os/bluestore: pass string_view to ctor of Allocator
Reviewed-by: Igor Fedotov <ifedotov@suse.com>
Kefu Chai [Fri, 28 May 2021 07:35:01 +0000 (15:35 +0800)]
Merge pull request #41582 from cyx1231st/wip-seastore-swap-read-extent
crimson/seastore: introduce and adopt LBAManager::get_mapping(t, offset)
Reviewed-by: Kefu Chai <kchai@redhat.com>
Yingxin Cheng [Thu, 27 May 2021 15:33:25 +0000 (23:33 +0800)]
crimson/seastore: adopt get_mapping(t, offset) interface
Signed-off-by: Yingxin Cheng <yingxin.cheng@intel.com>
Yingxin Cheng [Thu, 27 May 2021 08:48:47 +0000 (16:48 +0800)]
crimson/seastore: implement and test get_mapping(t, laddr)
Signed-off-by: Yingxin Cheng <yingxin.cheng@intel.com>
Yingxin Cheng [Thu, 27 May 2021 07:02:15 +0000 (15:02 +0800)]
crimson/seastore: add stub to introduce get_mapping() without length
Signed-off-by: Yingxin Cheng <yingxin.cheng@intel.com>
Kefu Chai [Fri, 28 May 2021 00:09:07 +0000 (08:09 +0800)]
Merge pull request #41578 from rzarzynski/wip-crimson-monc-auth-req
crimson/monc: handle_auth_request() doesn't depend on active_con.
Reviewed-by: Kefu Chai <kchai@redhat.com>
Kefu Chai [Thu, 27 May 2021 23:59:34 +0000 (07:59 +0800)]
Merge pull request #41544 from tchaikov/wip-doc-confval
doc/mgr: use confval directive to define options
Reviewed-by: Neha Ojha <nojha@redhat.com>
Kefu Chai [Wed, 26 May 2021 04:00:57 +0000 (12:00 +0800)]
doc/mgr: use confval directive to define options
less repetition this way
Signed-off-by: Kefu Chai <kchai@redhat.com>
Yuri Weinstein [Thu, 27 May 2021 23:40:41 +0000 (16:40 -0700)]
Merge pull request #41540 from ceph/wip-15213
doc: 15.2.13 Release Notes
Reviewed-by: Ilya Dryomov <idryomov@redhat.com>
Reviewed-by: Josh Durgin <jdurgin@redhat.com>
Reviewed-by: Casey Bodley <cbodley@redhat.com>
Reviewed-by: Sebastian Wagner <sebastian.wagner@suse.com>
Reviewed-by: Ramana Raja <rraja@redhat.com>
Reviewed-by: Neha Ojha <nojha@redhat.com>
Sage Weil [Thu, 27 May 2021 23:14:53 +0000 (19:14 -0400)]
Merge PR #41483 into master
* refs/pull/41483/head:
cephadm: stop passing --no-hosts to podman
mgr/nfs: use host.addr for backend IP where possible
mgr/cephadm: convert host addr if non-IP to IP
mgr/dashboard,prometheus: new method of getting mgr IP
doc/cephadm: remove any reference to the use of DNS or /etc/hosts
mgr/cephadm: use known host addr
mgr/cephadm: resolve IP at 'orch host add' time
Reviewed-by: Sebastian Wagner <swagner@suse.com>
zdover23 [Thu, 27 May 2021 21:41:40 +0000 (07:41 +1000)]
Merge pull request #41561 from zdover23/wip-doc-cephadm-s-mgmt-service-status-improvement-2021-05-26
doc/cephadm: enrich "service status"
Reviewed-by: Sebastian Wagner <sewagner@redhat.com>
Sage Weil [Tue, 25 May 2021 17:55:08 +0000 (13:55 -0400)]
cephadm: stop passing --no-hosts to podman
This reverts cfc1f914ce74f1fd1f45e2efd3ba2ddcb2da129a, which is no longer
necessary because (1) we don't use socket.getfqdn(), and (2) we generally
do not rely on DNS or /etc/hosts at all anymore (with the exception of
the upgrade transition).
Signed-off-by: Sage Weil <sage@newdream.net>
Sage Weil [Wed, 26 May 2021 22:38:05 +0000 (18:38 -0400)]
mgr/nfs: use host.addr for backend IP where possible
Signed-off-by: Sage Weil <sage@newdream.net>
Sage Weil [Tue, 25 May 2021 20:10:49 +0000 (16:10 -0400)]
mgr/cephadm: convert host addr if non-IP to IP
Previously we allowed the host.addr to be a DNS name (short or fqdn).
This is problematic because of the inconsistent way that docker and podman
handle /etc/hosts, and undesirable because relying on external DNS is
an external source of failure for the cluster without any benefit in
return (simply updating DNS is not sufficient to make ceph behave).
So: update any non-IP to an IP as soon as we start up (presumably on
upgrade). If we get a loopback address (127.0.0.1 or 127.0.1.1), then
wait and hope that the next instance of the manager has better luck.
Signed-off-by: Sage Weil <sage@newdream.net>
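The rule in the "convert host addr if non-IP to IP" commit above (leave IP literals alone, convert everything else, and reject a loopback result) can be sketched as below. The mgr/cephadm code itself is Python; this C++ version is only an illustration, all function names are hypothetical, and it relies on the POSIX `inet_pton()` call.

```cpp
#include <arpa/inet.h>   // inet_pton (POSIX)
#include <netinet/in.h>  // in_addr, in6_addr
#include <string>

// True if addr parses as an IPv4 or IPv6 literal rather than a DNS name.
bool is_ip_literal(const std::string& addr) {
  in_addr v4{};
  in6_addr v6{};
  return inet_pton(AF_INET, addr.c_str(), &v4) == 1 ||
         inet_pton(AF_INET6, addr.c_str(), &v6) == 1;
}

// The two loopback results the commit message singles out.
bool is_loopback_result(const std::string& ip) {
  return ip == "127.0.0.1" || ip == "127.0.1.1";
}

// A stored host address needs converting only when it is not already an IP.
bool needs_conversion(const std::string& addr) {
  return !is_ip_literal(addr);
}
```

A caller would resolve only the addresses for which `needs_conversion()` is true, and keep the old value when `is_loopback_result()` holds for the resolved address.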
Sage Weil [Tue, 25 May 2021 17:00:35 +0000 (13:00 -0400)]
mgr/dashboard,prometheus: new method of getting mgr IP
- Use a centralized method get_mgr_ip()
- Look up the hostname via DNS. This is a bit more reliable than
getfqdn() since it will work even when podman adds the container
name to /etc/hosts.
Signed-off-by: Sage Weil <sage@newdream.net>
Sage Weil [Tue, 25 May 2021 16:14:39 +0000 (12:14 -0400)]
doc/cephadm: remove any reference to the use of DNS or /etc/hosts
Signed-off-by: Sage Weil <sage@newdream.net>
Sage Weil [Fri, 21 May 2021 17:31:31 +0000 (13:31 -0400)]
mgr/cephadm: use known host addr
If the host IP/addr is known, use that. The addr might even be a FQDN
instead of an IP address, in which case we want to look that up instead
of the bare hostname.
Signed-off-by: Sage Weil <sage@newdream.net>
Radoslaw Zarzynski [Thu, 27 May 2021 14:55:40 +0000 (14:55 +0000)]
crimson/monc: handle_auth_request() doesn't depend on active_con.
The following crash occurred at Sepia [1]:
```
INFO 2021-05-26 20:16:32,872 [shard 0] ms - [osd.0(client) v2:172.21.15.119:6803/31733 >> unknown.? -@55220] ProtocolV2::start_accept(): target_addr=172.21.15.119:55220/0
DEBUG 2021-05-26 20:16:32,872 [shard 0] ms - [osd.0(client) v2:172.21.15.119:6803/31733 >> unknown.? -@55220] TRIGGER ACCEPTING, was NONE
DEBUG 2021-05-26 20:16:32,873 [shard 0] ms - [osd.0(client) v2:172.21.15.119:6803/31733 >> unknown.? -@55220] SEND(26) banner: len_payload=16, supported=1, required=0, banner="ceph v2
"
DEBUG 2021-05-26 20:16:32,873 [shard 0] ms - [osd.0(client) v2:172.21.15.119:6803/31733 >> unknown.? -@55220] RECV(10) banner: "ceph v2
"
DEBUG 2021-05-26 20:16:32,873 [shard 0] ms - [osd.0(client) v2:172.21.15.119:6803/31733 >> unknown.? -@55220] GOT banner: payload_len=16
DEBUG 2021-05-26 20:16:32,873 [shard 0] ms - [osd.0(client) v2:172.21.15.119:6803/31733 >> unknown.? -@55220] RECV(16) banner features: supported=1 required=0
DEBUG 2021-05-26 20:16:32,873 [shard 0] ms - [osd.0(client) v2:172.21.15.119:6803/31733 >> unknown.? -@55220] WRITE HelloFrame: my_type=osd, peer_addr=172.21.15.119:55220/0
DEBUG 2021-05-26 20:16:32,873 [shard 0] ms - [osd.0(client) v2:172.21.15.119:6803/31733 >> unknown.? -@55220] GOT HelloFrame: my_type=client peer_addr=v2:172.21.15.119:6803/31733
INFO 2021-05-26 20:16:32,873 [shard 0] ms - [osd.0(client) v2:172.21.15.119:6803/31733 >> client.? -@55220] UPDATE: peer_type=client, policy(lossy=true server=true standby=false resetcheck=false)
DEBUG 2021-05-26 20:16:32,873 [shard 0] ms - [osd.0(client) v2:172.21.15.119:6803/31733 >> client.? -@55220] GOT AuthRequestFrame: method=2, preferred_modes={1, 2}, payload_len=174
/home/jenkins-build/build/workspace/ceph-dev-new-build/ARCH/x86_64/AVAILABLE_ARCH/x86_64/AVAILABLE_DIST/centos8/DIST/centos8/MACHINE_SIZE/gigantic/release/17.0.0-4622-gaa1dc559/rpm/el8/BUILD/ceph-17.0.0-4622-gaa1dc559/src/crimson/mon/MonClient.cc:399:10: runtime error: member access within null pointer of type 'struct Connection'
Segmentation fault on shard 0.
Backtrace:
0# 0x000055E84CF44C1F in ceph-osd
1# FatalSignal::signaled(int, siginfo_t const*) in ceph-osd
2# FatalSignal::install_oneshot_signal_handler<11>()::{lambda(int, siginfo_t*, void*)#1}::_FUN(int, siginfo_t*, void*) in ceph-osd
3# 0x00007F2BC88C0B20 in /lib64/libpthread.so.0
4# crimson::mon::Connection::get_conn() in ceph-osd
5# crimson::mon::Client::handle_auth_request(seastar::shared_ptr<crimson::net::Connection>, seastar::lw_shared_ptr<AuthConnectionMeta>, bool, unsigned int, ceph::buffer::v15_2_0::list const&, ceph::buffer::v15_2_0::list*) in ceph-osd
6# crimson::net::ProtocolV2::_handle_auth_request(ceph::buffer::v15_2_0::list&, bool) in ceph-osd
7# 0x000055E84DF67669 in ceph-osd
8# 0x000055E84DF68775 in ceph-osd
9# 0x000055E846F47F60 in ceph-osd
10# 0x000055E85296770F in ceph-osd
11# 0x000055E85296CC50 in ceph-osd
12# 0x000055E852B1ECBB in ceph-osd
13# 0x000055E85267C73A in ceph-osd
14# main in ceph-osd
15# __libc_start_main in /lib64/libc.so.6
16# _start in ceph-osd
Fault at location: 0x98
```
[1]: http://pulpito.front.sepia.ceph.com/rzarzynski-2021-05-26_12:20:26-rados-master-distro-basic-smithi/6136907
When `handle_auth_request()` is called, there is no guarantee that
`active_con` is available. This is reflected in the classical
implementation:
```cpp
int MonClient::handle_auth_request(
Connection *con,
// ...
ceph::buffer::list *reply)
{
// ...
bool isvalid = ah->verify_authorizer(
cct,
*rotating_secrets,
payload,
auth_meta->get_connection_secret_length(),
reply,
&con->peer_name,
&con->peer_global_id,
&con->peer_caps_info,
&auth_meta->session_key,
&auth_meta->connection_secret,
ac);
```
The patch transplants the same logic to crimson.
Signed-off-by: Radoslaw Zarzynski <rzarzyns@redhat.com>
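The idea behind the `handle_auth_request()` fix above, namely that the handler should work on the connection that carried the request rather than on `active_con`, can be illustrated with a heavily simplified sketch. `Conn` and `MonClientSketch` are hypothetical types, and the empty-name check merely stands in for the real authorizer verification.

```cpp
#include <cassert>
#include <memory>
#include <string>

// Hypothetical, heavily simplified connection type.
struct Conn {
  std::string peer_name;
};

struct MonClientSketch {
  // May legitimately be null, e.g. before a monitor session is established.
  std::unique_ptr<Conn> active_con;

  // Records the authenticated peer on the connection that delivered the
  // request. active_con is never dereferenced here, so a null active_con
  // cannot crash this path.
  bool handle_auth_request(Conn* con, const std::string& claimed_name) {
    assert(con != nullptr);
    if (claimed_name.empty()) {
      return false;  // stand-in for a failed authorizer verification
    }
    con->peer_name = claimed_name;
    return true;
  }
};
```

With this shape, an incoming auth request can be served even while `active_con` is still unset, which is the situation the backtrace above ran into.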
Kefu Chai [Thu, 27 May 2021 14:26:05 +0000 (22:26 +0800)]
os/bluestore: pass string_view to ctor of Allocator
just for the sake of correctness: they don't need a full-blown
std::string, only a string-like object. and they always
create a std::string instance as a member variable if they want to have
a copy of it.
Signed-off-by: Kefu Chai <kchai@redhat.com>
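A minimal sketch of the constructor shape described in the string_view commit above, assuming a hypothetical `AllocatorSketch` class (the real Allocator interface is not reproduced here): the parameter is a `std::string_view`, and the class makes its own `std::string` copy only because it wants to keep the name.

```cpp
#include <string>
#include <string_view>

// Hypothetical class, loosely modelled on the commit description.
class AllocatorSketch {
public:
  // Accepts any string-like argument (literal, std::string, string_view)
  // without forcing the caller to build a std::string first.
  explicit AllocatorSketch(std::string_view name) : name_(name) {}

  const std::string& name() const { return name_; }

private:
  std::string name_;  // owned copy, independent of the caller's buffer
};
```

Because the member is an owned copy, later changes to the caller's string do not affect the stored name.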
Kefu Chai [Thu, 27 May 2021 15:14:36 +0000 (23:14 +0800)]
tools/ceph_objectstore_tool: destruct ObjectStore using unique_ptr<>
before this change, cot never destructs the created ObjectStore
instances.
after this change, they are destructed upon returning from main().
Signed-off-by: Kefu Chai <kchai@redhat.com>
Kefu Chai [Thu, 27 May 2021 03:08:48 +0000 (11:08 +0800)]
osd: pass unique_ptr<ObjectStore> to ctor of OSD
less error-prone, and it's simpler to manage the resource using RAII
Signed-off-by: Kefu Chai <kchai@redhat.com>
Kefu Chai [Tue, 25 May 2021 07:43:47 +0000 (15:43 +0800)]
osd/OSD: remove unused include headers
Signed-off-by: Kefu Chai <kchai@redhat.com>
Kefu Chai [Tue, 25 May 2021 07:41:26 +0000 (15:41 +0800)]
osd/OSD: use scope_guard to umount objectstore
RAII can simplify the cleanup logic in OSD::mkfs().
and since `ch` is a smart pointer, it is able to take care of itself,
as long as we ensure that it is destructed before objectstore.
Signed-off-by: Kefu Chai <kchai@redhat.com>
Kefu Chai [Tue, 25 May 2021 07:34:34 +0000 (15:34 +0800)]
osd: pass unique_ptr<ObjectStore> to OSD::mkfs()
less error-prone this way.
Signed-off-by: Kefu Chai <kchai@redhat.com>
Kefu Chai [Tue, 25 May 2021 07:18:21 +0000 (15:18 +0800)]
os: let ObjectStore::create() return unique_ptr<>
instead of returning a raw pointer to ObjectStore, let
`ObjectStore::create()` return a `std::unique_ptr<ObjectStore>`.
less error-prone this way.
Signed-off-by: Kefu Chai <kchai@redhat.com>
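The factory shape described in the `ObjectStore::create()` commit above might look roughly like this. `StoreSketch`, `MemStoreSketch` and `create_store()` are hypothetical names used for illustration only; the real signature takes more parameters.

```cpp
#include <memory>
#include <string>
#include <string_view>

// Hypothetical store interface.
struct StoreSketch {
  virtual ~StoreSketch() = default;
  virtual std::string type() const = 0;
};

struct MemStoreSketch final : StoreSketch {
  std::string type() const override { return "memstore"; }
};

// Returns an owning pointer, or nullptr for an unknown type. The caller
// cannot forget to delete the result: ownership is part of the type.
std::unique_ptr<StoreSketch> create_store(std::string_view type) {
  if (type == "memstore") {
    return std::make_unique<MemStoreSketch>();
  }
  return nullptr;
}
```

A caller that used to hold a raw pointer now either keeps the `unique_ptr` or moves it into the object that should own the store.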
ofriedma [Thu, 27 May 2021 14:46:41 +0000 (17:46 +0300)]
Merge pull request #41495 from pleiadesian/patch-quota-cache
rgw: remove quota soft threshold
ofriedma [Thu, 27 May 2021 14:32:08 +0000 (17:32 +0300)]
Merge pull request #41288 from ofriedma/wip-ofriedma-segfault
rgw: crash on multipart upload to bucket with policy
Ilya Dryomov [Thu, 27 May 2021 13:23:42 +0000 (15:23 +0200)]
Merge pull request #41529 from Yenya/rbd-deep-cp-docs
doc/rbd: document cp versus deep cp
Reviewed-by: Ilya Dryomov <idryomov@gmail.com>
Jan "Yenya" Kasprzak [Tue, 25 May 2021 11:43:52 +0000 (13:43 +0200)]
doc/rbd: document cp versus deep cp
I found that the difference between "rbd cp" and "rbd deep cp",
i.e. what "deep" means in this context, is documented only in
the mailing list archive and in the Mimic release notes.
Let's make the difference explicit in the manpage and in rbd --help.
Signed-off-by: Jan "Yenya" Kasprzak <kas@fi.muni.cz>
Sebastian Wagner [Thu, 27 May 2021 09:54:24 +0000 (11:54 +0200)]
Merge pull request #41224 from adk3798/change-mon-stack-images-docs
doc/cephadm: recommend redeploying monitoring stack daemon after changing image
Reviewed-by: Sebastian Wagner <sewagner@redhat.com>
Kefu Chai [Thu, 27 May 2021 09:39:30 +0000 (17:39 +0800)]
Merge pull request #41566 from anthonyeleven/anthonyeleven/update-rgw-yaml-in
src/common/options: improve spelling, capitalization, and wording in rgw.yml.in
Reviewed-by: Kefu Chai <kchai@redhat.com>
Sebastian Wagner [Thu, 27 May 2021 09:36:33 +0000 (11:36 +0200)]
Merge pull request #41400 from liewegas/fix-50113
doc/releases/pacific: add note about rgw on upgrade
Reviewed-by: Sebastian Wagner <sewagner@redhat.com>
Milind Changire [Thu, 27 May 2021 08:09:23 +0000 (13:39 +0530)]
Merge pull request #40831 from vshankar/wip-cephfs-mirror-incremental-sync
cephfs-mirror: incremental sync
Reviewed-by: Milind Changire <mchangir@redhat.com>
Ilya Dryomov [Thu, 27 May 2021 07:58:32 +0000 (09:58 +0200)]
Merge pull request #41279 from pkalever/promote-attach
rbd: promote rbd-nbd attach and detach at rbd integrated cli
Reviewed-by: Xiubo Li <xiubli@redhat.com>
Reviewed-by: Mykola Golub <mgolub@suse.com>
Reviewed-by: Ilya Dryomov <idryomov@gmail.com>
Kefu Chai [Thu, 27 May 2021 07:23:44 +0000 (15:23 +0800)]
Merge pull request #41378 from varshar16/wip-check-file-inputs-nfs
pybind/mgr: generalize CLICheckNonemptyFileInput() error msg
Reviewed-by: Patrick Donnelly <pdonnell@redhat.com>
Reviewed-by: Alfonso Martínez <almartin@redhat.com>
Reviewed-by: Ernesto Puerta <epuertat@redhat.com>
Reviewed-by: Waad Alkhoury <walkhour@redhat.com>
Kefu Chai [Thu, 27 May 2021 07:21:47 +0000 (15:21 +0800)]
Merge pull request #41381 from AmnonHanuhov/wip-Refactor_PeeringState
crimson/osd: Refactor PeeringState
Reviewed-by: Yingxin Cheng <yingxin.cheng@intel.com>
Reviewed-by: Radoslaw Zarzynski <rzarzyns@redhat.com>
Kefu Chai [Thu, 27 May 2021 07:19:12 +0000 (15:19 +0800)]
Merge pull request #41516 from tchaikov/wip-47380
mon/OSDMonitor: drop stale failure_info even if can_mark_down()
Reviewed-by: Josh Durgin <jdurgin@redhat.com>
Reviewed-by: Neha Ojha <nojha@redhat.com>
Kefu Chai [Thu, 27 May 2021 07:17:48 +0000 (15:17 +0800)]
Merge pull request #41546 from tchaikov/wip-crush-alignment
crush/crush: ensure alignof(crush_work_bucket) is 1
Reviewed-by: Radoslaw Zarzynski <rzarzyns@redhat.com>
Kefu Chai [Thu, 27 May 2021 07:17:11 +0000 (15:17 +0800)]
Merge pull request #41517 from tchaikov/wip-osd-osd-types
osd/osd_type: use f->dump_unsigned() when appropriate
Reviewed-by: Radoslaw Zarzynski <rzarzyns@redhat.com>
Reviewed-by: Neha Ojha <nojha@redhat.com>
Kefu Chai [Thu, 27 May 2021 07:16:07 +0000 (15:16 +0800)]
Merge pull request #41527 from t-msn/cleanup-peeringstate-init
osd/PeeringState: cleanup dead code in PeeringState::init
Reviewed-by: Kefu Chai <kchai@redhat.com>
Kefu Chai [Thu, 27 May 2021 06:13:24 +0000 (14:13 +0800)]
Merge pull request #41565 from anthonyeleven/anthonyeleven/update-rgw-chunk
doc/radosgw: modernize reference to rgw_max_chunk_size
Reviewed-by: Kefu Chai <kchai@redhat.com>
Anthony D'Atri [Thu, 27 May 2021 05:47:06 +0000 (22:47 -0700)]
src/common/options: improve spelling, capitalization, and wording
Signed-off-by: Anthony D'Atri <anthony.datri@gmail.com>
Anthony D'Atri [Thu, 27 May 2021 05:37:33 +0000 (22:37 -0700)]
doc/radosgw: modernize reference to rgw_max_chunk_size
The value changed from 512KB to 4MB in Kraken. Reference the prevailing
option default instead of embedding the current value.
Signed-off-by: Anthony D'Atri <anthony.datri@gmail.com>
Samuel Just [Thu, 27 May 2021 05:05:05 +0000 (22:05 -0700)]
Merge pull request #41564 from tchaikov/wip-dmclock-seastar
dmclock: pick up change to support seastar
Reviewed-by: Samuel Just <sjust@redhat.com>
Samuel Just [Thu, 27 May 2021 05:04:19 +0000 (22:04 -0700)]
Merge pull request #41560 from athanatos/sjust/wip-clang-linker-problem
crimson/os/seastore: resolve clang build problems, misc cleanups
Reviewed-by: Kefu Chai <kchai@redhat.com>
Kefu Chai [Thu, 27 May 2021 03:31:32 +0000 (11:31 +0800)]
dmclock: pick up change to support seastar
so if WITH_SEASTAR is defined, the POSIX synchronization primitives
are either replaced with seastar counterparts or disabled.
Signed-off-by: Kefu Chai <kchai@redhat.com>
Satoru Takeuchi [Thu, 27 May 2021 02:09:39 +0000 (02:09 +0000)]
rgw: add the description of blocking io during index resharding
It's nice to describe that write I/Os are blocked during resharding.
Signed-off-by: Satoru Takeuchi <satoru.takeuchi@gmail.com>
Zac Dover [Thu, 27 May 2021 01:28:38 +0000 (11:28 +1000)]
doc/cephadm: enrich "service status"
This PR improves the syntax of the "Service
Status" section of the "Service Management"
section of the cephadm guide. This includes
pretty significant reworking of the information
in the section, so vetting this one might be
annoying. Anyway, I think I've lowered the
cognitive load on the reader.
Signed-off-by: Zac Dover <zac.dover@gmail.com>
Samuel Just [Wed, 26 May 2021 23:57:12 +0000 (16:57 -0700)]
crimson/os/seastore/seastore: add helpers to simplify omap usage
Add _omap_get_values and _omap_get_value to clarify omap_get_values and
get_attr. Also resolves a clang linker error.
Signed-off-by: Samuel Just <sjust@redhat.com>
Samuel Just [Wed, 26 May 2021 22:40:32 +0000 (15:40 -0700)]
crimson/os/seastore: use tuple return for omap_list throughout
Signed-off-by: Samuel Just <sjust@redhat.com>
Samuel Just [Wed, 26 May 2021 22:39:34 +0000 (15:39 -0700)]
crimson/os/seastore/seastore.h: remove unnecessary whitespace
Signed-off-by: Samuel Just <sjust@redhat.com>
Samuel Just [Wed, 26 May 2021 22:39:12 +0000 (15:39 -0700)]
crimson/os/seastore/seastore.h: remove non-const repeat_with_onode
Signed-off-by: Samuel Just <sjust@redhat.com>
Samuel Just [Wed, 26 May 2021 22:38:44 +0000 (15:38 -0700)]
crimson/os/futurized_store: use ceph::bufferlist to match
Signed-off-by: Samuel Just <sjust@redhat.com>
Sage Weil [Wed, 26 May 2021 22:42:29 +0000 (18:42 -0400)]
Merge PR #41351 into master
* refs/pull/41351/head:
cephadm: clean-up error message
cephadm: raise an error when `--config` file is not found
Reviewed-by: Sage Weil <sage@redhat.com>
Sage Weil [Wed, 26 May 2021 22:42:06 +0000 (18:42 -0400)]
Merge PR #41283 into master
* refs/pull/41283/head:
cephadm: manage cephadm log with logrotated
Reviewed-by: Sebastian Wagner <swagner@suse.com>
Sage Weil [Fri, 21 May 2021 16:32:49 +0000 (12:32 -0400)]
mgr/cephadm: resolve IP at 'orch host add' time
We prefer to always have a real IP for hosts in the cluster. This avoids
a reliance on DNS for most operations.
Perhaps more importantly, it means we are less sensitive to inconsistent
host lookup results, for example due to (1) mismatched /etc/hosts files
between machines, or (2) a lookup of the local hostname that returns
127.0.1.1.
Adjust with_hosts() fixture to take an addr, and adjust tests accordingly.
Signed-off-by: Sage Weil <sage@newdream.net>
Neha Ojha [Wed, 26 May 2021 21:36:59 +0000 (21:36 +0000)]
doc/releases/octopus.rst: rados updates for 15.2.13
Signed-off-by: Neha Ojha <nojha@redhat.com>
Adam C. Emerson [Wed, 26 May 2021 17:52:57 +0000 (13:52 -0400)]
Merge pull request #41465 from adamemerson/wip-50169
rgw: Simplify log shard probing and err on the side of omap
Reviewed-by: Casey Bodley <cbodley@redhat.com>
Kefu Chai [Wed, 26 May 2021 15:03:46 +0000 (23:03 +0800)]
Merge pull request #41554 from rzarzynski/wip-crimson-simplify-ox-lt-mgmt
crimson/osd: simplify the management of OpsExecuter's life-time.
Reviewed-by: Kefu Chai <kchai@redhat.com>
Radoslaw Zarzynski [Wed, 26 May 2021 13:20:52 +0000 (13:20 +0000)]
crimson/osd: simplify the management of OpsExecuter's life-time.
Signed-off-by: Radoslaw Zarzynski <rzarzyns@redhat.com>
Sage Weil [Wed, 26 May 2021 14:12:06 +0000 (10:12 -0400)]
Merge PR #41510 into master
* refs/pull/41510/head:
doc/cephfs/nfs: remove documented limitation
Reviewed-by: Ramana Raja <rraja@redhat.com>
Reviewed-by: Jeff Layton <jlayton@redhat.com>
Kefu Chai [Wed, 26 May 2021 12:05:29 +0000 (20:05 +0800)]
mgr: expose CRUSHMap.find_roots()
so mgr module could use it to enumerate all nodes without parents
See-also: https://tracker.ceph.com/issues/50971
Signed-off-by: Kefu Chai <kchai@redhat.com>
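"Nodes without parents", as mentioned in the find_roots() commit above, can be computed along these lines. This is a sketch under a hypothetical representation (a map from parent id to child ids); it is not the CRUSHMap API that the commit exposes to mgr modules.

```cpp
#include <map>
#include <set>
#include <vector>

// Hypothetical representation: parent id -> ids of its direct children.
// Returns every id that never appears as somebody's child.
std::set<int> find_roots_sketch(
    const std::map<int, std::vector<int>>& children) {
  std::set<int> roots;
  std::set<int> has_parent;
  for (const auto& [parent, kids] : children) {
    roots.insert(parent);
    for (int kid : kids) {
      roots.insert(kid);
      has_parent.insert(kid);
    }
  }
  for (int id : has_parent) {
    roots.erase(id);
  }
  return roots;
}
```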
Kefu Chai [Wed, 26 May 2021 10:44:31 +0000 (18:44 +0800)]
Merge pull request #41547 from t-msn/wip-update-cephspec
ceph.spec.in: install gcc-toolset-9-gcc-c++ for rhel only
Reviewed-by: Kefu Chai <kchai@redhat.com>
Kefu Chai [Wed, 26 May 2021 06:58:33 +0000 (14:58 +0800)]
crush/crush: ensure alignof(crush_work_bucket) is 1
in do_rule(), we allocate the space for crush_work_bucket using
char work[crush_work_size(crush, maxout)];
where crush_work_size() calculates the size like:
map->working_size + result_max * 3 * sizeof(__u32);
so work is allocated on the stack, but the alignment of the
crush_work_bucket struct is not taken into consideration, so in
crush_init_workspace(), point could point to an address which is not
aligned to 8 bytes, which is the alignment of crush_work_bucket by
default. the same goes for its member variables: all of them are uint32_t,
and hence are also 8-byte aligned.
to ensure the compiler generates the correct assembly for accessing
the member variables without assuming that the struct is 8-byte
aligned, we should specify the alignment explicitly.
in this change, `__attribute__ ((packed))` is specified for
crush_work_bucket, so that its alignment is 1.
this issue was spotted by ASan, which complains like:
../src/crush/mapper.c:881:22: runtime error: member access within misaligned address 0x7ffe051f90dc for type 'struct crush_work_bucket', which requires 8 byte alignment
0x7ffe051f90dc: note: pointer points here
1d e5 77 3d 68 55 00 00 00 00 00 00 00 00 00 00 20 93 1f 05 fe 7f 00 00 10 91 1f 05 fe 7f 00 00
^
../src/crush/mapper.c:882:22: runtime error: member access within misaligned address 0x7ffe051f90dc for type 'struct crush_work_bucket', which requires 8 byte alignment
0x7ffe051f90dc: note: pointer points here
1d e5 77 3d 00 00 00 00 00 00 00 00 00 00 00 00 20 93 1f 05 fe 7f 00 00 10 91 1f 05 fe 7f 00 00
^
../src/crush/mapper.c:883:20: runtime error: member access within misaligned address 0x7ffe051f90dc for type 'struct crush_work_bucket', which requires 8 byte alignment
0x7ffe051f90dc: note: pointer points here
1d e5 77 3d 00 00 00 00 00 00 00 00 00 00 00 00 20 93 1f 05 fe 7f 00 00 10 91 1f 05 fe 7f 00 00
^
Fixes: https://tracker.ceph.com/issues/50978
Signed-off-by: Kefu Chai <kchai@redhat.com>
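The effect of `__attribute__ ((packed))` described in the crush_work_bucket commit above can be checked with a small sketch. The two structs below are hypothetical and their field layout is an assumption made for illustration; the attribute itself is a GCC/Clang extension.

```cpp
#include <cstdint>
#include <new>

// Hypothetical layout with natural alignment.
struct WorkBucketNatural {
  std::uint32_t perm_x;
  std::uint32_t perm_n;
  std::uint32_t* perm;
};

// Same fields, but packed: the struct may start at any byte address and
// the compiler must not assume aligned member accesses.
struct __attribute__((packed)) WorkBucketPacked {
  std::uint32_t perm_x;
  std::uint32_t perm_n;
  std::uint32_t* perm;
};

static_assert(alignof(WorkBucketPacked) == 1,
              "a packed struct has an alignment requirement of 1");

// Constructs a packed bucket at a deliberately odd offset inside a byte
// buffer, writes a member and reads it back.
std::uint32_t roundtrip_at_odd_offset(std::uint32_t value) {
  alignas(8) unsigned char work[sizeof(WorkBucketPacked) + 8] = {};
  auto* bucket = new (work + 1) WorkBucketPacked{};
  bucket->perm_x = value;
  return bucket->perm_x;
}
```

Doing the same with `WorkBucketNatural` at `work + 1` would be the kind of misaligned member access the sanitizer report above complains about.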
Zulai Wang [Sat, 22 May 2021 13:21:10 +0000 (21:21 +0800)]
rgw: remove quota soft threshold
Remove quota soft threshold, which causes expensive checks for sharded buckets
Fixes: 14eabd4aa7b8a2e2c0c43fe7f877ed2171277526
Signed-off-by: Zulai Wang <wangzl31@outlook.com>
Misono Tomohiro [Wed, 26 May 2021 07:10:35 +0000 (16:10 +0900)]
ceph.spec.in: install gcc-toolset-9-gcc-c++ for rhel only
Otherwise fedora 33 complains there is no gcc-toolset-9-gcc-c++
when running "WITH_SEASTAR=true ./install_deps.sh"
Related to: 36759b53635
Signed-off-by: Misono Tomohiro <misono.tomohiro@jp.fujitsu.com>
Kefu Chai [Wed, 26 May 2021 07:18:22 +0000 (15:18 +0800)]
Merge pull request #41545 from tchaikov/wip-vstart-fix
vstart.sh: pass the addition option to parse_block_devs()
Reviewed-by: Samuel Just <sjust@redhat.com>
Varsha Rao [Wed, 19 May 2021 08:12:04 +0000 (13:42 +0530)]
mgr/dashboard/access_control: fix flake8 expected 2 blank lines error
Signed-off-by: Varsha Rao <varao@redhat.com>
Varsha Rao [Tue, 18 May 2021 09:16:32 +0000 (14:46 +0530)]
mgr/nfs: use CLICheckNonemptyFileInput decorator
Fixes: https://tracker.ceph.com/issues/50858
Signed-off-by: Varsha Rao <varao@redhat.com>
Varsha Rao [Tue, 18 May 2021 09:12:29 +0000 (14:42 +0530)]
pybind/mgr: generalize CLICheckNonemptyFileInput() error msg
Signed-off-by: Varsha Rao <varao@redhat.com>
Varsha Rao [Mon, 17 May 2021 13:37:53 +0000 (19:07 +0530)]
pybind/mgr: check if file contains only spaces
Signed-off-by: Varsha Rao <varao@redhat.com>
Kefu Chai [Wed, 26 May 2021 06:02:51 +0000 (14:02 +0800)]
vstart.sh: use || instead of "-o"
to silence the warning like:
SC2166: Prefer [ p ] || [ q ] as [ p -o q ] is not well defined.
see also
https://pubs.opengroup.org/onlinepubs/9699919799/utilities/test.html
Signed-off-by: Kefu Chai <kchai@redhat.com>
Kefu Chai [Wed, 26 May 2021 06:00:37 +0000 (14:00 +0800)]
vstart.sh: pass the addition option to parse_block_devs()
to address the regression introduced by
3ea5242e381a850c080ee9edbaeea28059ad4da9
Signed-off-by: Kefu Chai <kchai@redhat.com>
Kefu Chai [Wed, 26 May 2021 05:55:03 +0000 (13:55 +0800)]
Merge pull request #41543 from runsisi/wip-fix-clay-doc
doc: add missing crush-device-class={device-class} pair for clay code profile
Reviewed-by: Kefu Chai <kchai@redhat.com>
Kefu Chai [Wed, 26 May 2021 04:58:37 +0000 (12:58 +0800)]
Merge pull request #41542 from tchaikov/wip-vstart-cleanup
vstart: cleanups
Reviewed-by: Chunmei Liu <chunmei.liu@intel.com>
Reviewed-by: Samuel Just <sjust@redhat.com>
Prasanna Kumar Kalever [Tue, 25 May 2021 12:24:29 +0000 (17:54 +0530)]
qa/workunits/rbd: use rbd cli for device attach/detach commands
Signed-off-by: Prasanna Kumar Kalever <prasanna.kalever@redhat.com>
Prasanna Kumar Kalever [Mon, 17 May 2021 09:40:45 +0000 (15:10 +0530)]
rbd: improve conditional compilation specific checks
Signed-off-by: Prasanna Kumar Kalever <prasanna.kalever@redhat.com>
Prasanna Kumar Kalever [Thu, 6 May 2021 07:27:56 +0000 (12:57 +0530)]
rbd: promote rbd-nbd attach and detach at rbd integrated cli
Example:
$ rbd device attach rbd-pool/image --device /dev/nbd0 --device-type nbd --force
$ rbd device detach rpool/image --device-type nbd
for now returning EOPNOTSUPP with krbd, ggate and wnbd
Signed-off-by: Prasanna Kumar Kalever <prasanna.kalever@redhat.com>
Samuel Just [Tue, 25 May 2021 00:45:16 +0000 (17:45 -0700)]
crimson/os/seastore/logging.h: use ##__VA_ARGS__ rather than __VA_OPT__
This seems to work with both clang and gcc for now.
Signed-off-by: Samuel Just <sjust@redhat.com>
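For the `##__VA_ARGS__` commit above: `__VA_OPT__` is a C++20 feature, while `, ##__VA_ARGS__` is a GNU extension that both GCC and Clang accept and that drops the preceding comma when no variadic arguments are given. A minimal sketch with a hypothetical macro (not the one in logging.h):

```cpp
#include <cstdio>
#include <string>

// Hypothetical formatting macro. With ", ##__VA_ARGS__" the comma before
// the variadic part disappears when the macro is invoked with no extra
// arguments, so both call sites below expand to valid snprintf() calls.
#define SKETCH_FORMAT(buf, fmt, ...) \
  std::snprintf((buf), sizeof(buf), fmt, ##__VA_ARGS__)

// Invocation without any variadic argument.
std::string format_no_args() {
  char buf[32];
  SKETCH_FORMAT(buf, "started");
  return buf;
}

// Invocation with one variadic argument.
std::string format_one_arg(int n) {
  char buf[32];
  SKETCH_FORMAT(buf, "n=%d", n);
  return buf;
}
```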
Samuel Just [Tue, 25 May 2021 00:46:23 +0000 (17:46 -0700)]
crimson/.../staged-fltree/tree_utils: fix cursor binding
Signed-off-by: Samuel Just <sjust@redhat.com>
Samuel Just [Wed, 26 May 2021 04:43:27 +0000 (04:43 +0000)]
test/crimson/test_backfill: fix captured bindings
Signed-off-by: Samuel Just <sjust@redhat.com>