git.apps.os.sepia.ceph.com Git - ceph-ci.git/log 
Radoslaw Zarzynski  [Tue, 19 Jan 2021 16:05:12 +0000  (17:05 +0100)] 
crimson: improve const-correctness of Operation::dump()s.
Signed-off-by: Radoslaw Zarzynski <rzarzyns@redhat.com>
Kefu Chai  [Thu, 3 Jun 2021 01:36:15 +0000  (09:36 +0800)]
Merge pull request #41627 from tchaikov/wip-mgr-repl-doc
Reviewed-by: Pere Diaz Bou <pdiazbou@redhat.com>
Kefu Chai  [Thu, 3 Jun 2021 01:34:56 +0000  (09:34 +0800)]
Merge pull request #41138 from kalebskeithley/python39
Reviewed-by: Kefu Chai <kchai@redhat.com>
Kefu Chai  [Thu, 3 Jun 2021 01:29:19 +0000  (09:29 +0800)]
do_cmake: build with python3.9 on RHEL9
Signed-off-by: Kaleb S KEITHLEY <kkeithle@redhat.com>
Signed-off-by: Kefu Chai <kchai@redhat.com>
Kefu Chai  [Thu, 3 Jun 2021 01:16:42 +0000  (09:16 +0800)]
Merge pull request #41496 from Huber-ming/correct_spell
Reviewed-by: Kefu Chai <kchai@redhat.com>
Patrick Donnelly  [Wed, 2 Jun 2021 15:18:22 +0000  (08:18 -0700)]
Merge PR #41635 into master
Reviewed-by: Ramana Raja <rraja@redhat.com>
Kefu Chai  [Wed, 2 Jun 2021 14:43:40 +0000  (22:43 +0800)]
Merge pull request #41644 from rzarzynski/wip-crimson-fix-blocked-peering
Reviewed-by: Kefu Chai <kchai@redhat.com>
Sage Weil  [Wed, 2 Jun 2021 14:27:03 +0000  (10:27 -0400)]
Merge PR #41651 into master
Reviewed-by: Kefu Chai <kchai@redhat.com>
Kefu Chai  [Wed, 2 Jun 2021 14:10:12 +0000  (22:10 +0800)]
Merge pull request #41645 from tchaikov/wip-crimson-osd-mkfs
Reviewed-by: Radoslaw Zarzynski <rzarzyns@redhat.com>
Zac Dover  [Wed, 2 Jun 2021 14:06:06 +0000  (00:06 +1000)]
doc/cephadm: s/the the/the
Signed-off-by: Zac Dover <zac.dover@gmail.com>
Kefu Chai  [Wed, 2 Jun 2021 12:57:14 +0000  (20:57 +0800)]
crimson/osd: check existing superblock when mkfs
Signed-off-by: Kefu Chai <kchai@redhat.com>
Kefu Chai  [Wed, 2 Jun 2021 12:47:03 +0000  (20:47 +0800)]
crimson/osd: extract OSD::_write_superblock() out
Signed-off-by: Kefu Chai <kchai@redhat.com>
Radoslaw Zarzynski  [Wed, 2 Jun 2021 11:59:37 +0000  (11:59 +0000)] 
crimson/monc: fix subscription stall that blocked peering.
There is a scenario in which `active_con` is chosen properly
but is not marked as `ready_to_send`.
If `renew_subs()` is called during `on_session_opened()`,
the flag is turned on only after the subscriptions are
renewed, which cannot happen because renewal requires the flag
to be set already. In other words, there is a circular data dependency.
The net result is a stalled subscription machinery,
particularly for the `OSDMap` subscriptions. This caused a nasty
peering issue at Sepia [1], where PG 2.7 got stuck in the `GetInfo`
state.
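As an illustration only, the circular dependency can be reproduced in a toy model. The sketch below is not crimson code: `ToyConnection`, `ToyMonClient` and both `on_session_opened_*` methods are hypothetical names, and setting the flag before renewing is just one way to break the cycle in this model, not necessarily what this commit does.

```cpp
#include <cassert>
#include <string>
#include <vector>

// Toy model only: none of these types are crimson's real classes.
struct ToyConnection {
  bool ready_to_send = false;
  std::vector<std::string> sent;

  // Returns true if the message actually went out, false if it was held back.
  bool send(const std::string& msg) {
    if (!ready_to_send) {
      return false;  // nothing leaves until the flag is set
    }
    sent.push_back(msg);
    return true;
  }
};

struct ToyMonClient {
  ToyConnection con;
  bool subs_renewed = false;

  // Renewal only counts as done once the message was really sent.
  bool renew_subs() {
    subs_renewed = con.send("mon_subscribe");
    return subs_renewed;
  }

  // Cyclic ordering: the flag is set only after a successful renewal,
  // but the renewal itself needs the flag, so neither ever happens.
  void on_session_opened_stalling() {
    if (renew_subs()) {
      con.ready_to_send = true;
    }
  }

  // One possible way to break the cycle in this toy model.
  void on_session_opened_unblocked() {
    con.ready_to_send = true;
    renew_subs();
  }
};

// Small self-check of the two orderings.
inline void demo() {
  ToyMonClient stalled;
  stalled.on_session_opened_stalling();
  assert(stalled.con.sent.empty());

  ToyMonClient ok;
  ok.on_session_opened_unblocked();
  assert(ok.con.sent.size() == 1);
}
```

In the stalling variant no message is ever sent and the flag never turns on; in the other variant exactly one subscription message goes out.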
```
rzarzynski@teuthology:/home/teuthworker/archive/rzarzynski-2021-05-26_12:20:26-rados-master-distro-basic-smithi/6136908$ less ./remote/smithi039/log/ceph-osd.1.log.gz
...
DEBUG 2021-05-26 20:19:48,134 [shard 0] osd -  pg_epoch 14 pg[2.7( DNE empty local-lis/les=0/0 n=0 ec=0/0 lis/c=0/0 les/c/f=0/0/0 sis=0) [] r=
-1 lpr=0 crt=0'0 mlcod 0'0 unknown enter Initial
...
DEBUG 2021-05-26 20:19:48,138 [shard 0] osd -  pg_epoch 14 pg[2.7( empty local-lis/les=0/0 n=0 ec=14/14 lis/c=0/0 les/c/f=0/0/0 sis=14) [1,0]
r=0 lpr=0 crt=0'0 mlcod 0'0 unknown enter Reset
...
DEBUG 2021-05-26 20:19:48,138 [shard 0] osd -  pg_epoch 14 pg[2.7( empty local-lis/les=0/0 n=0 ec=14/14 lis/c=0/0 les/c/f=0/0/0 sis=14) [1,0] r=0 lpr=14 crt=0'0 mlcod 0'0 unknown enter Started
...
DEBUG 2021-05-26 20:19:48,138 [shard 0] osd -  pg_epoch 14 pg[2.7( empty local-lis/les=0/0 n=0 ec=14/14 lis/c=0/0 les/c/f=0/0/0 sis=14) [1,0] r=0 lpr=14 crt=0'0 mlcod 0'0 unknown enter Start
...
DEBUG 2021-05-26 20:19:48,138 [shard 0] osd -  pg_epoch 14 pg[2.7( empty local-lis/les=0/0 n=0 ec=14/14 lis/c=0/0 les/c/f=0/0/0 sis=14) [1,0] r=0 lpr=14 crt=0'0 mlcod 0'0 unknown enter Started/Primary
...
DEBUG 2021-05-26 20:19:48,138 [shard 0] osd -  pg_epoch 14 pg[2.7( empty local-lis/les=0/0 n=0 ec=14/14 lis/c=0/0 les/c/f=0/0/0 sis=14) [1,0] r=0 lpr=14 crt=0'0 mlcod 0'0 creating enter Started/Primary/Peering
...
DEBUG 2021-05-26 20:19:48,138 [shard 0] osd -  pg_epoch 14 pg[2.7( empty local-lis/les=0/0 n=0 ec=14/14 lis/c=0/0 les/c/f=0/0/0 sis=14) [1,0] r=0 lpr=14 crt=0'0 mlcod 0'0 creating+peering enter Started/Primary/Peering/GetInfo
DEBUG 2021-05-26 20:19:48,138 [shard 0] osd -  pg_epoch 14 pg[2.7( empty local-lis/les=0/0 n=0 ec=14/14 lis/c=0/0 les/c/f=0/0/0 sis=14) [1,0] r=0 lpr=14 crt=0'0 mlcod 0'0 creating+peering build_prior all_probe
DEBUG 2021-05-26 20:19:48,139 [shard 0] osd -  pg_epoch 14 pg[2.7( empty local-lis/les=0/0 n=0 ec=14/14 lis/c=0/0 les/c/f=0/0/0 sis=14) [1,0] r=0 lpr=14 crt=0'0 mlcod 0'0 creating+peering build_prior final: probe 0,1 down  blocked_by {}
DEBUG 2021-05-26 20:19:48,139 [shard 0] osd -  pg_epoch 14 pg[2.7( empty local-lis/les=0/0 n=0 ec=14/14 lis/c=0/0 les/c/f=0/0/0 sis=14) [1,0] r=0 lpr=14 crt=0'0 mlcod 0'0 creating+peering up_thru 0 < same_since 14, must notify monitor
DEBUG 2021-05-26 20:19:48,139 [shard 0] osd -  pg_epoch 14 pg[2.7( empty local-lis/les=0/0 n=0 ec=14/14 lis/c=0/0 les/c/f=0/0/0 sis=14) [1,0] r=0 lpr=14 crt=0'0 mlcod 0'0 creating+peering state<Started/Primary/Peering/GetInfo>:  no prior_set down osds, clearing prior_readable_until_ub
DEBUG 2021-05-26 20:19:48,139 [shard 0] osd -  pg_epoch 14 pg[2.7( empty local-lis/les=0/0 n=0 ec=14/14 lis/c=0/0 les/c/f=0/0/0 sis=14) [1,0] r=0 lpr=14 crt=0'0 mlcod 0'0 creating+peering state<Started/Primary/Peering/GetInfo>:  querying info from osd.0
...
DEBUG 2021-05-26 20:19:48,237 [shard 0] osd -  pg_epoch 14 pg[2.7( empty local-lis/les=0/0 n=0 ec=14/14 lis/c=0/0 les/c/f=0/0/0 sis=14) [1,0] r=0 lpr=14 crt=0'0 mlcod 0'0 creating+peering  got osd.0 2.7( DNE empty local-lis/les=0/0 n=0 ec=0/0 lis/c=0/0 les/c/f=0/0/0 sis=0)
DEBUG 2021-05-26 20:19:48,237 [shard 0] osd -  pg_epoch 14 pg[2.7( empty local-lis/les=0/0 n=0 ec=14/14 lis/c=0/0 les/c/f=0/0/0 sis=14) [1,0] r=0 lpr=14 crt=0'0 mlcod 0'0 creating+peering state<Started/Primary/Peering/GetInfo>: Adding osd: 0 peer features: 3f01cfbb7ffdffff
DEBUG 2021-05-26 20:19:48,237 [shard 0] osd -  pg_epoch 14 pg[2.7( empty local-lis/les=0/0 n=0 ec=14/14 lis/c=0/0 les/c/f=0/0/0 sis=14) [1,0] r=0 lpr=14 crt=0'0 mlcod 0'0 creating+peering state<Started/Primary/Peering/GetInfo>: Common peer features: 3f01cfbb7ffdffff
DEBUG 2021-05-26 20:19:48,237 [shard 0] osd -  pg_epoch 14 pg[2.7( empty local-lis/les=0/0 n=0 ec=14/14 lis/c=0/0 les/c/f=0/0/0 sis=14) [1,0] r=0 lpr=14 crt=0'0 mlcod 0'0 creating+peering state<Started/Primary/Peering/GetInfo>: Common acting features: 3f01cfbb7ffdffff
DEBUG 2021-05-26 20:19:48,238 [shard 0] osd -  pg_epoch 14 pg[2.7( empty local-lis/les=0/0 n=0 ec=14/14 lis/c=0/0 les/c/f=0/0/0 sis=14) [1,0] r=0 lpr=14 crt=0'0 mlcod 0'0 creating+peering state<Started/Primary/Peering/GetInfo>: Common upacting features: 3f01cfbb7ffdffff
DEBUG 2021-05-26 20:19:48,238 [shard 0] osd -  pg_epoch 14 pg[2.7( empty local-lis/les=0/0 n=0 ec=14/14 lis/c=0/0 les/c/f=0/0/0 sis=14) [1,0] r=0 lpr=14 crt=0'0 mlcod 0'0 creating+peering exit Started/Primary/Peering/GetInfo 0.099480 4 2021-05-26T20:19:48.146172+0000
...
DEBUG 2021-05-26 20:19:48,238 [shard 0] osd -  pg_epoch 14 pg[2.7( empty local-lis/les=0/0 n=0 ec=14/14 lis/c=0/0 les/c/f=0/0/0 sis=14) [1,0] r=0 lpr=14 crt=0'0 mlcod 0'0 creating+peering enter Started/Primary/Peering/GetLog
...
DEBUG 2021-05-26 20:19:48,238 [shard 0] osd -  pg_epoch 14 pg[2.7( empty local-lis/les=0/0 n=0 ec=14/14 lis/c=0/0 les/c/f=0/0/0 sis=14) [1,0] r=0 lpr=14 crt=0'0 mlcod 0'0 creating+peering enter Started/Primary/Peering/GetMissing
...
DEBUG 2021-05-26 20:19:48,238 [shard 0] osd -  pg_epoch 14 pg[2.7( empty local-lis/les=0/0 n=0 ec=14/14 lis/c=0/0 les/c/f=0/0/0 sis=14) [1,0] r=0 lpr=14 crt=0'0 mlcod 0'0 creating+peering enter Started/Primary/Peering/WaitUpThru
...
DEBUG 2021-05-26 20:19:49,139 [shard 0] osd -  pg_epoch 15 pg[2.7( empty local-lis/les=0/0 n=0 ec=14/14 lis/c=0/0 les/c/f=0/0/0 sis=14) [1,0] r=0 lpr=14 crt=0'0 mlcod 0'0 creating enter Started/Primary/Active
...
DEBUG 2021-05-26 20:19:49,142 [shard 0] osd -  pg_epoch 15 pg[2.7( empty local-lis/les=14/15 n=0 ec=14/14 lis/c=0/0 les/c/f=0/0/0 sis=14) [1,0] r=0 lpr=14 crt=0'0 mlcod 0'0 creating+activating enter Started/Primary/Active/Activating
...
DEBUG 2021-05-26 20:19:49,204 [shard 0] osd -  pg_epoch 15 pg[2.7( empty local-lis/les=14/15 n=0 ec=14/14 lis/c=14/0 les/c/f=15/0/0 sis=14) [1,0] r=0 lpr=14 crt=0'0 mlcod 0'0 active enter Started/Primary/Active/Recovered
...
DEBUG 2021-05-26 20:19:49,204 [shard 0] osd -  pg_epoch 15 pg[2.7( empty local-lis/les=14/15 n=0 ec=14/14 lis/c=14/0 les/c/f=15/0/0 sis=14) [1,0] r=0 lpr=14 crt=0'0 mlcod 0'0 active enter Started/Primary/Active/Clean
...
DEBUG 2021-05-26 20:22:31,223 [shard 0] osd -  pg_epoch 86 pg[2.7( empty local-lis/les=14/15 n=0 ec=14/14 lis/c=14/14 les/c/f=15/15/0 sis=14) [1,0] r=0 lpr=14 crt=0'0 mlcod 0'0 active enter Reset
...
<a lot of flipping>
...
DEBUG 2021-05-26 20:24:07,851 [shard 0] osd -  pg_epoch 163 pg[2.7( empty local-lis/les=14/15 n=0 ec=14/14 lis/c=14/14 les/c/f=15/15/0 sis=163
) [1,0] r=0 lpr=163 pi=[14,163)/1 crt=0'0 mlcod 0'0 unknown activate_map
DEBUG 2021-05-26 20:24:07,851 [shard 0] osd -  pg_epoch 163 pg[2.7( empty local-lis/les=14/15 n=0 ec=14/14 lis/c=14/14 les/c/f=15/15/0 sis=163
) [1,0] r=0 lpr=163 pi=[14,163)/1 crt=0'0 mlcod 0'0 unknown exit Reset 0.035744 1 2021-05-26T20:24:07.817331+0000
INFO  2021-05-26 20:24:07,851 [shard 0] osd - Exiting state: Reset, entered at 1622060647.8158188, 1622060647.8173316 spent on 1 events
DEBUG 2021-05-26 20:24:07,851 [shard 0] osd -  pg_epoch 163 pg[2.7( empty local-lis/les=14/15 n=0 ec=14/14 lis/c=14/14 les/c/f=15/15/0 sis=163
) [1,0] r=0 lpr=163 pi=[14,163)/1 crt=0'0 mlcod 0'0 unknown enter Started
INFO  2021-05-26 20:24:07,851 [shard 0] osd - Entering state: Started
DEBUG 2021-05-26 20:24:07,851 [shard 0] osd -  pg_epoch 163 pg[2.7( empty local-lis/les=14/15 n=0 ec=14/14 lis/c=14/14 les/c/f=15/15/0 sis=163
) [1,0] r=0 lpr=163 pi=[14,163)/1 crt=0'0 mlcod 0'0 unknown enter Start
INFO  2021-05-26 20:24:07,851 [shard 0] osd - Entering state: Start
INFO  2021-05-26 20:24:07,851 [shard 0] osd -  pg_epoch 163 pg[2.7( empty local-lis/les=14/15 n=0 ec=14/14 lis/c=14/14 les/c/f=15/15/0 sis=163
) [1,0] r=0 lpr=163 pi=[14,163)/1 crt=0'0 mlcod 0'0 unknown state<Start>: transitioning to Primary
DEBUG 2021-05-26 20:24:07,851 [shard 0] osd -  pg_epoch 163 pg[2.7( empty local-lis/les=14/15 n=0 ec=14/14 lis/c=14/14 les/c/f=15/15/0 sis=163
) [1,0] r=0 lpr=163 pi=[14,163)/1 crt=0'0 mlcod 0'0 unknown exit Start 0.000041 0 0.000000
INFO  2021-05-26 20:24:07,851 [shard 0] osd - Exiting state: Start, entered at 1622060647.8516333, 0.0 spent on 0 events
DEBUG 2021-05-26 20:24:07,852 [shard 0] osd -  pg_epoch 163 pg[2.7( empty local-lis/les=14/15 n=0 ec=14/14 lis/c=14/14 les/c/f=15/15/0 sis=163
) [1,0] r=0 lpr=163 pi=[14,163)/1 crt=0'0 mlcod 0'0 unknown enter Started/Primary
INFO  2021-05-26 20:24:07,852 [shard 0] osd - Entering state: Started/Primary
DEBUG 2021-05-26 20:24:07,852 [shard 0] osd -  pg_epoch 163 pg[2.7( empty local-lis/les=14/15 n=0 ec=14/14 lis/c=14/14 les/c/f=15/15/0 sis=163
) [1,0] r=0 lpr=163 pi=[14,163)/1 crt=0'0 mlcod 0'0 unknown enter Started/Primary/Peering
INFO  2021-05-26 20:24:07,852 [shard 0] osd - Entering state: Started/Primary/Peering
DEBUG 2021-05-26 20:24:07,852 [shard 0] osd -  pg_epoch 163 pg[2.7( empty local-lis/les=14/15 n=0 ec=14/14 lis/c=14/14 les/c/f=15/15/0 sis=163) [1,0] r=0 lpr=163 pi=[14,163)/1 crt=0'0 mlcod 0'0 peering enter Started/Primary/Peering/GetInfo
INFO  2021-05-26 20:24:07,852 [shard 0] osd - Entering state: Started/Primary/Peering/GetInfo
...
DEBUG 2021-05-26 20:24:07,852 [shard 0] osd -  pg_epoch 163 pg[2.7( empty local-lis/les=14/15 n=0 ec=14/14 lis/c=14/14 les/c/f=15/15/0 sis=163) [1,0] r=0 lpr=163 pi=[14,163)/1 crt=0'0 mlcod 0'0 peering build_prior all_probe 0,1,4
DEBUG 2021-05-26 20:24:07,852 [shard 0] osd -  pg_epoch 163 pg[2.7( empty local-lis/les=14/15 n=0 ec=14/14 lis/c=14/14 les/c/f=15/15/0 sis=163) [1,0] r=0 lpr=163 pi=[14,163)/1 crt=0'0 mlcod 0'0 peering build_prior maybe_rw interval:139, acting: 0
DEBUG 2021-05-26 20:24:07,852 [shard 0] osd -  pg_epoch 163 pg[2.7( empty local-lis/les=14/15 n=0 ec=14/14 lis/c=14/14 les/c/f=15/15/0 sis=163) [1,0] r=0 lpr=163 pi=[14,163)/1 crt=0'0 mlcod 0'0 peering build_prior final: probe 0,1,4 down  blocked_by {}
DEBUG 2021-05-26 20:24:07,852 [shard 0] osd -  pg_epoch 163 pg[2.7( empty local-lis/les=14/15 n=0 ec=14/14 lis/c=14/14 les/c/f=15/15/0 sis=163) [1,0] r=0 lpr=163 pi=[14,163)/1 crt=0'0 mlcod 0'0 peering up_thru 125 < same_since 163, must notify monitor
DEBUG 2021-05-26 20:24:07,852 [shard 0] osd -  pg_epoch 163 pg[2.7( empty local-lis/les=14/15 n=0 ec=14/14 lis/c=14/14 les/c/f=15/15/0 sis=163) [1,0] r=0 lpr=163 pi=[14,163)/1 crt=0'0 mlcod 0'0 peering state<Started/Primary/Peering/GetInfo>:  no prior_set down osds, clearing prior_readable_until_ub
DEBUG 2021-05-26 20:24:07,852 [shard 0] osd -  pg_epoch 163 pg[2.7( empty local-lis/les=14/15 n=0 ec=14/14 lis/c=14/14 les/c/f=15/15/0 sis=163) [1,0] r=0 lpr=163 pi=[14,163)/1 crt=0'0 mlcod 0'0 peering state<Started/Primary/Peering/GetInfo>:  querying info from osd.0
DEBUG 2021-05-26 20:24:07,852 [shard 0] osd -  pg_epoch 163 pg[2.7( empty local-lis/les=14/15 n=0 ec=14/14 lis/c=14/14 les/c/f=15/15/0 sis=163) [1,0] r=0 lpr=163 pi=[14,163)/1 crt=0'0 mlcod 0'0 peering state<Started/Primary/Peering/GetInfo>:  querying info from osd.4
...
DEBUG 2021-05-26 20:24:07,924 [shard 0] ms - [osd.1(cluster) v2:172.21.15.39:6803/34727@61064 >> osd.4 v2:172.21.15.62:6802/34686] connect to existing
DEBUG 2021-05-26 20:24:07,924 [shard 0] ms - [osd.1(cluster) v2:172.21.15.39:6803/34727@61064 >> osd.4 v2:172.21.15.62:6802/34686] --> #62 === pg_query2(2.7 2.7 query(info 0'0 epoch_sent 163) e163/163) v1 (131)
...
DEBUG 2021-05-26 20:24:07,942 [shard 0] ms - [osd.1(cluster) v2:172.21.15.39:6803/34727@61064 >> osd.4 v2:172.21.15.62:6802/34686] GOT AckFrame: seq=62
...
<plenty of osd_ping messaging but no reply to the pg_query for 2.7>
...
DEBUG 2021-05-26 20:58:19,829 [shard 0] ms - [osd.1(hb_front) v2:172.21.15.39:6807/34727 >> osd.4 v2:172.21.15.62:6807/34686@54816] <== #772 =
== osd_ping(ping e17 up_from 10 ping_stamp 2021-05-26T20:58:19.825573+0000/2319.780029297s send_stamp 2319.780029297s) v5 (70)
DEBUG 2021-05-26 20:58:19,829 [shard 0] ms - [osd.1(hb_front) v2:172.21.15.39:6807/34727 >> osd.4 v2:172.21.15.62:6807/34686@54816] --> #772 === osd_ping(ping_reply e249 up_from 10 ping_stamp 2021-05-26T20:58:19.825573+0000/2319.780029297s send_stamp 2320.039062500s) v5 (70
```
The peering request got stuck while waiting for an `OSDMap`.
```
DEBUG 2021-05-26 20:24:07,930 [shard 0] ms - [osd.4(cluster) v2:172.21.15.62:6802/34686 >> osd.1 v2:172.21.15.39:6803/34727@61064] <== #62 === pg_query2(2.7 2.7 query(info 0'0 epoch_sent 163) e163/163) v1 (131)
DEBUG 2021-05-26 20:24:07,930 [shard 0] osd - handle_peering_op on 2.7 from 1
DEBUG 2021-05-26 20:24:07,930 [shard 0] osd - peering_event(id=517, detail=PeeringEvent(from=1 pgid=2.7 sent=163 requested=163 evt=epoch_sent: 163 epoch_requested: 163 MQuery 2.7 from 1 query_epoch 163 query: query(info 0'0 epoch_sent 163))): star
```
```
INFO  2021-05-26 20:19:49,127 [shard 0] osd - evt epoch is 15, i have 14, will wait
INFO  2021-05-26 20:19:49,128 [shard 0] osd - osdmap_subscribe(14)
DEBUG 2021-05-26 20:19:49,128 [shard 0] ms - [osd.4(client) v2:172.21.15.62:6801/34686@63208 >> mon.1 v2:172.21.15.62:3300/0] --> #9 === mon_s
ubscribe({osdmap=14}) v3 (15)
...
INFO  2021-05-26 20:19:49,131 [shard 0] osd - handle_osd_map osd_map(14..15 src has 1..15) v4
INFO  2021-05-26 20:19:49,131 [shard 0] osd - handle_osd_map epochs [14..15], i have 15, src has [1..15]
...
INFO  2021-05-26 20:19:49,138 [shard 0] osd - handle_osd_map osd_map(14..15 src has 1..15) v4
INFO  2021-05-26 20:19:49,138 [shard 0] osd - handle_osd_map epochs [14..15], i have 15, src has [1..15]
...
INFO  2021-05-26 20:19:49,139 [shard 0] osd - evt epoch is 15, i have 14, will wait
INFO  2021-05-26 20:19:49,141 [shard 0] osd - osdmap_subscribe(14)
WARN  2021-05-26 20:19:49,141 [shard 0] monc - renew_subs - empty
...
INFO  2021-05-26 20:19:50,140 [shard 0] osd - handle_osd_map osd_map(15..16 src has 1..16) v4
INFO  2021-05-26 20:19:50,140 [shard 0] osd - handle_osd_map epochs [15..16], i have 15, src has [1..16]
DEBUG 2021-05-26 20:19:50,141 [shard 0] bluestore - do_transaction
INFO  2021-05-26 20:19:50,145 [shard 0] osd - osd.4: committed_osd_maps(16, 16)
...
INFO  2021-05-26 20:20:42,881 [shard 0] osd - handle_osd_map epochs [16..17], i have 16, src has [1..17]
DEBUG 2021-05-26 20:20:42,882 [shard 0] bluestore - do_transaction
INFO  2021-05-26 20:20:42,886 [shard 0] osd - osd.4: committed_osd_maps(17, 17)
...
INFO  2021-05-26 20:20:43,941 [shard 0] osd - evt epoch is 18, i have 17, will wait
INFO  2021-05-26 20:20:43,941 [shard 0] osd - osdmap_subscribe(17)
...
INFO  2021-05-26 20:20:43,957 [shard 0] osd - evt epoch is 18, i have 17, will wait
INFO  2021-05-26 20:20:43,957 [shard 0] osd - osdmap_subscribe(17)
...
INFO  2021-05-26 20:20:43,969 [shard 0] osd - evt epoch is 18, i have 17, will wait
INFO  2021-05-26 20:20:43,969 [shard 0] osd - osdmap_subscribe(17)
...
DEBUG 2021-05-26 20:20:46,930 [shard 0] ms - [osd.4(client) v2:172.21.15.62:6801/34686@57288 >> mon.2 v2:172.21.15.39:3301/0] <== #4 === osd_m
ap(20..21 src has 1..21) v4 (41)
INFO  2021-05-26 20:20:46,930 [shard 0] osd - handle_osd_map osd_map(20..21 src has 1..21) v4
INFO  2021-05-26 20:20:46,930 [shard 0] osd - handle_osd_map epochs [20..21], i have 17, src has [1..21]
INFO  2021-05-26 20:20:46,930 [shard 0] osd - handle_osd_map message skips epochs 18..19
INFO  2021-05-26 20:20:46,930 [shard 0] osd - osdmap_subscribe(18)
...
DEBUG 2021-05-26 20:20:47,936 [shard 0] ms - [osd.4(client) v2:172.21.15.62:6801/34686@57288 >> mon.2 v2:172.21.15.39:3301/0] <== #5 === osd_m
ap(21..22 src has 1..22) v4 (41)
INFO  2021-05-26 20:20:47,936 [shard 0] osd - handle_osd_map osd_map(21..22 src has 1..22) v4
INFO  2021-05-26 20:20:47,936 [shard 0] osd - handle_osd_map epochs [21..22], i have 17, src has [1..22]
INFO  2021-05-26 20:20:47,936 [shard 0] osd - handle_osd_map message skips epochs 18..20
INFO  2021-05-26 20:20:47,936 [shard 0] osd - osdmap_subscribe(18)
...
<osdmap_subscribe(18) over and over>
```
```
2021-05-26T20:19:42.048+0000 7f4712ffd700  1 -- [v2:172.21.15.62:3300/0,v1:172.21.15.62:6789/0] <== osd.4 v2:172.21.15.62:6801/34686 4 ==== mon_subscribe({mgrmap=0+,osd_pg_creates=0+,osdmap=0+}) v3 ==== 82+0+0 (secure 0 0 0) 0x7f46fc04e150 con 0x7f470401c480
2021-05-26T20:19:42.048+0000 7f4712ffd700 20 mon.b@1(peon) e1 _ms_dispatch existing session 0x7f46fc02f500 for osd.4
2021-05-26T20:19:42.048+0000 7f4712ffd700 20 mon.b@1(peon) e1  entity_name osd.4 global_id 4168 (new_ok) caps allow *
2021-05-26T20:19:42.048+0000 7f4712ffd700 10 mon.b@1(peon) e1 handle_subscribe mon_subscribe({mgrmap=0+,osd_pg_creates=0+,osdmap=0+}) v3
...
2021-05-26T20:19:49.129+0000 7f4712ffd700  1 -- [v2:172.21.15.62:3300/0,v1:172.21.15.62:6789/0] <== osd.4 v2:172.21.15.62:6801/34686 9 ==== mo
n_subscribe({osdmap=14}) v3 ==== 36+0+0 (secure 0 0 0) 0x7f46e8556210 con 0x7f470401c480
2021-05-26T20:19:49.129+0000 7f4712ffd700 20 mon.b@1(peon) e1 _ms_dispatch existing session 0x7f46fc02f500 for osd.4
2021-05-26T20:19:49.129+0000 7f4712ffd700 20 mon.b@1(peon) e1  entity_name osd.4 global_id 4168 (new_ok) caps allow *
2021-05-26T20:19:49.129+0000 7f4712ffd700 10 mon.b@1(peon) e1 handle_subscribe mon_subscribe({osdmap=14}) v3
2021-05-26T20:19:49.129+0000 7f4712ffd700 20 is_capable service=mon command= read addr v2:172.21.15.62:6801/34686 on cap allow *
2021-05-26T20:19:49.129+0000 7f4712ffd700 20  allow so far , doing grant allow *
2021-05-26T20:19:49.129+0000 7f4712ffd700 20  allow all
2021-05-26T20:19:49.129+0000 7f4712ffd700 20 is_capable service=osd command= read addr v2:172.21.15.62:6801/34686 on cap allow *
2021-05-26T20:19:49.129+0000 7f4712ffd700 20  allow so far , doing grant allow *
2021-05-26T20:19:49.129+0000 7f4712ffd700 20  allow all
2021-05-26T20:19:49.129+0000 7f4712ffd700 10 mon.b@1(peon).osd e15 check_osdmap_sub 0x7f46e84f0150 next 14 (onetime)
2021-05-26T20:19:49.129+0000 7f4712ffd700  5 mon.b@1(peon).osd e15 send_incremental [14..15] to osd.4
2021-05-26T20:19:49.129+0000 7f4712ffd700 10 mon.b@1(peon).osd e15 build_incremental [14..15] with features 3f01cfbb7ffdffff
2021-05-26T20:19:49.129+0000 7f4712ffd700 20 mon.b@1(peon).osd e15 build_incremental    inc 15 622 bytes
2021-05-26T20:19:49.129+0000 7f4712ffd700 20 mon.b@1(peon).osd e15 build_incremental    inc 14 578 bytes
2021-05-26T20:19:49.129+0000 7f4712ffd700  1 -- [v2:172.21.15.62:3300/0,v1:172.21.15.62:6789/0] --> v2:172.21.15.62:6801/34686 -- osd_map(14..
15 src has 1..15) v4 -- 0x7f46e856a100 con 0x7f470401c480
```
```
seastar::future<> Client::renew_subs()
{
  if (!sub.have_new()) {
    logger().warn("{} - empty", __func__);
    return seastar::now();
  }
  logger().trace("{}", __func__);
  auto m = crimson::make_message<MMonSubscribe>();
  m->what = sub.get_subs();
  m->hostname = ceph_get_short_hostname();
  return send_message(std::move(m)).then([this] {
    sub.renewed();
  });
}
```
```
INFO  2021-05-26 20:19:42,081 [shard 0] osd - osdmap_subscribe(1)
DEBUG 2021-05-26 20:19:42,081 [shard 0] ms - [osd.4(client) v2:172.21.15.62:6801/34686@63208 >> mon.1 v2:172.21.15.62:3300/0] --> #6 === mon_s
ubscribe({osdmap=1}) v3 (15)
...
INFO  2021-05-26 20:19:49,128 [shard 0] osd - osdmap_subscribe(14)
DEBUG 2021-05-26 20:19:49,128 [shard 0] ms - [osd.4(client) v2:172.21.15.62:6801/34686@63208 >> mon.1 v2:172.21.15.62:3300/0] --> #9 === mon_subscribe({osdmap=14}) v3 (15)
...
INFO  2021-05-26 20:19:49,141 [shard 0] osd - osdmap_subscribe(14)
WARN  2021-05-26 20:19:49,141 [shard 0] monc - renew_subs - empty
<no MMonSubscribe>
...
INFO  2021-05-26 20:20:43,941 [shard 0] osd - evt epoch is 18, i have 17, will wait
INFO  2021-05-26 20:20:43,941 [shard 0] osd - osdmap_subscribe(17)
<no MMonSubscribe>
...
INFO  2021-05-26 20:20:46,930 [shard 0] osd - handle_osd_map message skips epochs 18..19
INFO  2021-05-26 20:20:46,930 [shard 0] osd - osdmap_subscribe(18)
<no MMonSubscribe>
```
[1]: http://pulpito.front.sepia.ceph.com/rzarzynski-2021-05-26_12:20:26-rados-master-distro-basic-smithi/6136908
Signed-off-by: Radoslaw Zarzynski <rzarzyns@redhat.com> 
Ernesto Puerta  [Wed, 2 Jun 2021 12:12:56 +0000  (14:12 +0200)] 
Merge pull request #41630 from rhcs-dashboard/fix-bucket-calculations
Reviewed-by: Alfonso Martínez <almartin@redhat.com>
Reviewed-by: Ernesto Puerta <epuertat@redhat.com>
Kefu Chai  [Wed, 2 Jun 2021 10:43:47 +0000  (18:43 +0800)]
Merge pull request #41638 from tchaikov/wip-doc-crimson-doc
Reviewed-by: Radoslaw Zarzynski <rzarzyns@redhat.com>
Kefu Chai  [Wed, 2 Jun 2021 09:10:25 +0000  (17:10 +0800)]
doc/dev/crimson: update link to scylladb debugging tips
Signed-off-by: Kefu Chai <kchai@redhat.com>
Kefu Chai  [Wed, 2 Jun 2021 09:00:53 +0000  (17:00 +0800)]
Merge pull request #41637 from tchaikov/wip-crimson-never-discard-future
Reviewed-by: Xuehan Xu <xuxuehan@360.cn>
Reviewed-by: Radoslaw Zarzynski <rzarzyns@redhat.com>
Kefu Chai  [Tue, 1 Jun 2021 11:58:47 +0000  (19:58 +0800)]
doc/mgr/modules: add a "debugging" section
Signed-off-by: Kefu Chai <kchai@redhat.com>
Kefu Chai  [Tue, 1 Jun 2021 12:40:16 +0000  (20:40 +0800)]
pybind/ceph_mgr_repl: define "timeout" opt as an int
Signed-off-by: Kefu Chai <kchai@redhat.com>
Avan Thakkar  [Tue, 1 Jun 2021 14:21:16 +0000  (19:51 +0530)]
mgr/dashboard: fix bucket objects and size calculations
Fixes: https://tracker.ceph.com/issues/51035
Signed-off-by: Avan Thakkar <athakkar@redhat.com>
Kefu Chai  [Wed, 2 Jun 2021 06:16:25 +0000  (14:16 +0800)]
crimson/common/interruptible_future: mark future 'nodiscard'
Signed-off-by: Kefu Chai <kchai@redhat.com>
Kefu Chai  [Wed, 2 Jun 2021 06:15:43 +0000  (14:15 +0800)]
crimson/common/errorator: mark errorator::future 'nodiscard'
Signed-off-by: Kefu Chai <kchai@redhat.com>
Kefu Chai  [Wed, 2 Jun 2021 06:13:04 +0000  (14:13 +0800)] 
crimson: always handle returned future
Ignoring a future without good reason could lead to catastrophic
issues. See also b127fa3cdd405c71cf09875f61f107c23af6b8cf.
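A minimal sketch of how the standard `[[nodiscard]]` attribute makes the compiler flag an ignored result. `ToyFuture` and `compute()` are hypothetical stand-ins for illustration, not crimson or Seastar types.

```cpp
#include <cassert>

// Hypothetical future-like type; marking the type [[nodiscard]] asks the
// compiler to warn whenever a function returns it and the caller drops it.
struct [[nodiscard]] ToyFuture {
  int value;
  int get() const { return value; }
};

// Hypothetical producer of a ToyFuture.
ToyFuture compute() { return ToyFuture{42}; }

inline void demo() {
  // compute();          // would draw a "discarded result" warning
  ToyFuture f = compute();  // handled: the result is kept and consumed
  assert(f.get() == 42);
}
```

Building with warnings treated as errors turns such a dropped result into a compile failure, which is the usual way to enforce the rule across a code base.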
Signed-off-by: Kefu Chai <kchai@redhat.com> 
Kefu Chai  [Wed, 2 Jun 2021 06:11:07 +0000  (14:11 +0800)] 
crimson/os: do not return a future in finally()
Signed-off-by: Kefu Chai <kchai@redhat.com>
Yuval Lifshitz  [Wed, 2 Jun 2021 04:47:39 +0000  (07:47 +0300)]
Merge pull request #41026 from TRYTOBE8TME/wip-rgw-rabbitmq
Patrick Donnelly  [Tue, 1 Jun 2021 21:00:23 +0000  (14:00 -0700)]
qa: increase fragmentation to improve uniform distribution
Fixes: https://tracker.ceph.com/issues/51060
Signed-off-by: Patrick Donnelly <pdonnell@redhat.com>
Ilya Dryomov  [Tue, 1 Jun 2021 19:56:57 +0000  (21:56 +0200)]
Merge pull request #41588 from idryomov/wip-rbd-trash-purge
Reviewed-by: Mykola Golub <mgolub@suse.com>
Kalpesh  [Tue, 20 Apr 2021 09:14:04 +0000  (14:44 +0530)]
qa/tasks: Adding RabbitMQ task for bucket notification tests
Signed-off-by: Kalpesh Pandya <kapandya@redhat.com>
Ernesto Puerta  [Tue, 1 Jun 2021 17:29:20 +0000  (19:29 +0200)]
Merge pull request #41421 from s0nea/wip-dashboard-rbd-partially-rm
Reviewed-by: Alfonso Martínez <almartin@redhat.com>
Reviewed-by: Avan Thakkar <athakkar@redhat.com>
Reviewed-by: Laura Paduano <lpaduano@suse.com>
Reviewed-by: Ilya Dryomov <idryomov@redhat.com>
Reviewed-by: Tatjana Dehler <tdehler@suse.com>
Reviewed-by: Mykola Golub <mgolub@suse.com>
Reviewed-by: Volker Theile <vtheile@suse.com>
Samuel Just  [Tue, 1 Jun 2021 15:53:31 +0000  (08:53 -0700)]
Merge pull request #41606 from liu-chunmei/seastore-fix-tracker
Reviewed-by: Samuel Just <sjust@redhat.com>
Reviewed-by: Kefu Chai <kchai@redhat.com>
Ilya Dryomov  [Tue, 1 Jun 2021 15:48:12 +0000  (17:48 +0200)] 
Merge pull request #41616 from idryomov/wip-rbd-qemu-precise-repos
Reviewed-by: Deepika Upadhyay <dupadhya@redhat.com>
Kefu Chai  [Tue, 1 Jun 2021 15:43:59 +0000  (23:43 +0800)]
Merge pull request #41605 from t-msn/update-podman-detection
Reviewed-by: Kefu Chai <kchai@redhat.com>
Kefu Chai  [Tue, 1 Jun 2021 15:00:57 +0000  (23:00 +0800)]
Merge pull request #41369 from ifed01/wip-ifed-fix-avl-enospc2
Reviewed-by: Adam Kupczyk <akupczyk@redhat.com>
Reviewed-by: Kefu Chai <kchai@redhat.com>
Ernesto Puerta  [Tue, 1 Jun 2021 14:28:48 +0000  (16:28 +0200)]
Merge pull request #41395 from rhcs-dashboard/fix-50855-master
Reviewed-by: Aashish Sharma <aasharma@redhat.com>
Reviewed-by: Alfonso Martínez <almartin@redhat.com>
Reviewed-by: Avan Thakkar <athakkar@redhat.com>
Reviewed-by: Ernesto Puerta <epuertat@redhat.com>
Reviewed-by: Nizamudeen A <nia@redhat.com>
Ernesto Puerta  [Tue, 1 Jun 2021 14:28:07 +0000  (16:28 +0200)]
Merge pull request #41598 from rhcs-dashboard/fix-51026-master
Reviewed-by: Avan Thakkar <athakkar@redhat.com>
Reviewed-by: Ernesto Puerta <epuertat@redhat.com>
Reviewed-by: Guillaume Abrioux <gabrioux@redhat.com>
Reviewed-by: Nizamudeen A <nia@redhat.com>
Ernesto Puerta  [Tue, 1 Jun 2021 14:26:30 +0000  (16:26 +0200)]
Merge pull request #41184 from rhcs-dashboard/fix-base-href
Reviewed-by: Aashish Sharma <aasharma@redhat.com>
Reviewed-by: Avan Thakkar <athakkar@redhat.com>
Reviewed-by: Ernesto Puerta <epuertat@redhat.com>
Sage Weil  [Tue, 1 Jun 2021 13:46:33 +0000  (09:46 -0400)]
Merge PR #41601 into master
Reviewed-by: Mike Perez <miperez@redhat.com>
Igor Fedotov  [Mon, 17 May 2021 19:23:26 +0000  (22:23 +0300)] 
os/bluestore: fix unexpected ENOSPC in Avl/Hybrid allocators.
Fixes: https://tracker.ceph.com/issues/50656
Signed-off-by: Igor Fedotov <ifedotov@suse.com>
Casey Bodley  [Tue, 1 Jun 2021 12:28:04 +0000  (08:28 -0400)]
Merge pull request #41470 from a16bitsysop/rgw_string.h
Reviewed-by: Kefu Chai <kchai@redhat.com>
Reviewed-by: Casey Bodley <cbodley@redhat.com>
Kefu Chai  [Tue, 1 Jun 2021 11:43:34 +0000  (19:43 +0800)]
Merge pull request #41591 from tchaikov/wip-mgr-selftest-repl
Reviewed-by: Ernesto Puerta <epuertat@redhat.com>
Kefu Chai  [Tue, 1 Jun 2021 11:28:06 +0000  (19:28 +0800)]
Merge pull request #41603 from rzarzynski/wip-crimson-fix-use-after-free-alienstore-get_attr
Reviewed-by: Kefu Chai <kchai@redhat.com>
Ilya Dryomov  [Tue, 1 Jun 2021 10:46:32 +0000  (12:46 +0200)]
qa/tasks/qemu: precise repos have been archived
Fixes: https://tracker.ceph.com/issues/51033
Signed-off-by: Ilya Dryomov <idryomov@gmail.com>
Sebastian Wagner  [Tue, 1 Jun 2021 09:30:10 +0000  (11:30 +0200)]
Merge pull request #41595 from zdover23/wip-doc-cephadm-serv-man-daemon-status-2021-05-30
Reviewed-by: Sebastian Wagner <sewagner@redhat.com>
Sebastian Wagner  [Tue, 1 Jun 2021 09:29:12 +0000  (11:29 +0200)]
Merge pull request #41608 from zdover23/wip-doc-cephadm-serv-man-service-spec-2021-05-30
Reviewed-by: Sebastian Wagner <sewagner@redhat.com>
Radoslaw Zarzynski  [Mon, 31 May 2021 23:37:04 +0000  (23:37 +0000)]
crimson/os: fix formatting in AlienStore::get_attr().
Signed-off-by: Radoslaw Zarzynski <rzarzyns@redhat.com>
Radoslaw Zarzynski  [Mon, 31 May 2021 22:05:25 +0000  (22:05 +0000)] 
crimson/os: fix use-after-free in AlienStore::get_attr().
The `FuturizedStore` interface requires that `get_attr()`
take the `name` parameter as a `std::string_view`, and
thus burdens implementations with extending the lifetime
of the data the instance refers to.
Unfortunately, `AlienStore` is unaware that prolonging
the life of a `std::string_view` instance doesn't prolong
the life of the memory it points to. This problem manifested
as the following use-after-free detected at Sepia:
```
rzarzynski@teuthology:/home/teuthworker/archive/rzarzynski-2021-05-26_12:20:26-rados-master-distro-basic-smithi/6136929$ less ./remote/smithi194/log/ceph-osd.7.log.gz
...
DEBUG 2021-05-26 20:24:54,077 [shard 0] osd - do_osd_ops_execute: object 14:55e1a5b4:test-rados-api-smithi067-38889-2::foo:head - handling op
call
DEBUG 2021-05-26 20:24:54,077 [shard 0] osd - handling op call on object 14:55e1a5b4:test-rados-api-smithi067-38889-2::foo:head
DEBUG 2021-05-26 20:24:54,078 [shard 0] osd - calling method lock.lock, num_read=0, num_write=0
DEBUG 2021-05-26 20:24:54,078 [shard 0] osd - handling op getxattr on object 14:55e1a5b4:test-rados-api-smithi067-38889-2::foo:head
DEBUG 2021-05-26 20:24:54,078 [shard 0] osd - getxattr on obj=14:55e1a5b4:test-rados-api-smithi067-38889-2::foo:head for attr=_lock.TestLockPP1
DEBUG 2021-05-26 20:24:54,078 [shard 0] bluestore - get_attr
=================================================================
==34068==ERROR: AddressSanitizer: heap-use-after-free on address 0x6030001851d0 at pc 0x7f824d6a5b27 bp 0x7f822b4201c0 sp 0x7f822b41f968
READ of size 17 at 0x6030001851d0 thread T28 (alien-store-tp)
...
    #0 0x7f824d6a5b26  (/lib64/libasan.so.5+0x40b26)
    #1 0x55e2cbb2e00b  (/usr/bin/ceph-osd+0x2b6dc00b)
    #2 0x55e2d31f086e  (/usr/bin/ceph-osd+0x32d9e86e)
    #3 0x55e2d3467607 in crimson::os::ThreadPool::loop(std::chrono::duration<long, std::ratio<1l, 1000l> >, unsigned long) (/usr/bin/ceph-osd+0x33015607)
    #4 0x55e2d346b14a  (/usr/bin/ceph-osd+0x3301914a)
    #5 0x7f8249d32ba2  (/lib64/libstdc++.so.6+0xc2ba2)
    #6 0x7f824a00d149 in start_thread (/lib64/libpthread.so.0+0x8149)
    #7 0x7f82486edf22 in clone (/lib64/libc.so.6+0xfcf22)
0x6030001851d0 is located 0 bytes inside of 31-byte region [0x6030001851d0,0x6030001851ef)
freed by thread T0 here:
    #0 0x7f824d757688 in operator delete(void*) (/lib64/libasan.so.5+0xf2688)
previously allocated by thread T0 here:
    #0 0x7f824d7567b0 in operator new(unsigned long) (/lib64/libasan.so.5+0xf17b0)
Thread T28 (alien-store-tp) created by T0 here:
    #0 0x7f824d6b7ea3 in __interceptor_pthread_create (/lib64/libasan.so.5+0x52ea3)
SUMMARY: AddressSanitizer: heap-use-after-free (/lib64/libasan.so.5+0x40b26)
Shadow bytes around the buggy address:
  0x0c06800289e0: fd fd fd fa fa fa fd fd fd fa fa fa 00 00 00 fa
  0x0c06800289f0: fa fa fd fd fd fa fa fa fd fd fd fa fa fa fd fd
  0x0c0680028a00: fd fa fa fa fd fd fd fa fa fa fd fd fd fa fa fa
  0x0c0680028a10: fd fd fd fa fa fa fd fd fd fa fa fa fd fd fd fa
  0x0c0680028a20: fa fa fd fd fd fa fa fa fd fd fd fa fa fa fd fd
=>0x0c0680028a30: fd fd fa fa fd fd fd fd fa fa[fd]fd fd fd fa fa
  0x0c0680028a40: fd fd fd fd fa fa fd fd fd fd fa fa 00 00 00 07
  0x0c0680028a50: fa fa 00 00 00 fa fa fa 00 00 00 fa fa fa fd fd
  0x0c0680028a60: fd fd fa fa fd fd fd fd fa fa fd fd fd fd fa fa
  0x0c0680028a70: 00 00 00 00 fa fa fd fd fd fd fa fa fd fd fd fd
  0x0c0680028a80: fa fa fd fd fd fd fa fa fd fd fd fd fa fa fd fd
Shadow byte legend (one shadow byte represents 8 application bytes):
  Addressable:           00
  Partially addressable: 01 02 03 04 05 06 07
  Heap left redzone:       fa
  Freed heap region:       fd
  Stack left redzone:      f1
  Stack mid redzone:       f2
  Stack right redzone:     f3
  Stack after return:      f5
  Stack use after scope:   f8
  Global redzone:          f9
  Global init order:       f6
  Poisoned by user:        f7
  Container overflow:      fc
  Array cookie:            ac
  Intra object redzone:    bb
  ASan internal:           fe
  Left alloca redzone:     ca
  Right alloca redzone:    cb
==34068==ABORTING
```
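As a reading aid for the shadow dump above, here is a minimal sketch of how the faulting address maps to the marked `[fd]` byte. It assumes the default AddressSanitizer mapping for x86-64 Linux (scale 3, offset `0x7fff8000`); the helper names are hypothetical and not part of ASan or ceph.

```cpp
#include <cstdint>

// Assumed default ASan mapping on x86-64 Linux:
//   shadow = (addr >> 3) + 0x7fff8000
constexpr std::uint64_t kShadowScale = 3;
constexpr std::uint64_t kShadowOffset = 0x7fff8000ULL;

// Hypothetical helper: shadow byte address for an application address.
constexpr std::uint64_t shadow_address(std::uint64_t app_addr) {
  return (app_addr >> kShadowScale) + kShadowOffset;
}

// The dump prints 16 shadow bytes per row; these split a shadow
// address into the row label and the column within that row.
constexpr std::uint64_t shadow_row(std::uint64_t shadow_addr) {
  return shadow_addr & ~0xfULL;
}
constexpr unsigned shadow_column(std::uint64_t shadow_addr) {
  return static_cast<unsigned>(shadow_addr & 0xfULL);
}

// Number of shadow bytes needed to cover `size` application bytes
// starting at an 8-byte-aligned address.
constexpr std::uint64_t shadow_bytes_for(std::uint64_t size) {
  return (size + 7) / 8;
}

static_assert(shadow_address(0x6030001851d0ULL) == 0x0c0680028a3aULL,
              "faulting address maps into the row marked with =>");
```

Under that assumption, `0x6030001851d0` lands in row `0x0c0680028a30` at column 10, which is the bracketed `fd`, and the 31-byte freed region spans the four consecutive `fd` bytes starting there.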
Signed-off-by: Radoslaw Zarzynski <rzarzyns@redhat.com> 
Tatjana Dehler  [Thu, 27 May 2021 09:46:50 +0000  (11:46 +0200)] 
mgr/dashboard: show partially deleted RBDs
Fixes: https://tracker.ceph.com/issues/48603
Signed-off-by: Tatjana Dehler <tdehler@suse.com>
Kefu Chai  [Tue, 1 Jun 2021 08:18:00 +0000  (16:18 +0800)] 
Merge pull request #41607 from liu-chunmei/seastore-cleanup-lba-get-mapping
Reviewed-by: Kefu Chai <kchai@redhat.com>
Kefu Chai  [Tue, 1 Jun 2021 08:17:07 +0000  (16:17 +0800)] 
Merge pull request #41597 from rhcs-dashboard/remove-promtool-script
Reviewed-by: Willem Jan Withagen <wjw@digiware.nl>
Reviewed-by: Kefu Chai <kchai@redhat.com>
chunmei-liu  [Tue, 1 Jun 2021 06:44:57 +0000  (23:44 -0700)] 
crimson/seastore: cleanup lba manager get_mappings
Signed-off-by: chunmei-liu <chunmei.liu@intel.com>
chunmei-liu  [Tue, 1 Jun 2021 05:54:55 +0000  (22:54 -0700)] 
crimson/seastore: fix assert in read_extent
Signed-off-by: chunmei-liu <chunmei.liu@intel.com>
Aashish Sharma  [Wed, 26 May 2021 07:08:33 +0000  (12:38 +0530)] 
test,cmake:remove run-promtool-unitests.sh script
Signed-off-by: Aashish Sharma <aasharma@redhat.com>
Aashish Sharma  [Tue, 1 Jun 2021 05:09:24 +0000  (10:39 +0530)] 
mgr/dashboard: API Version changes do not apply to pre-defined methods (list, create etc.)
Fixes: https://tracker.ceph.com/issues/50855
Signed-off-by: Aashish Sharma <aasharma@redhat.com>
Kefu Chai  [Sat, 29 May 2021 17:10:25 +0000  (01:10 +0800)] 
pybind/mgr/selftest: add "mgr self-test eval" command
Signed-off-by: Kefu Chai <kchai@redhat.com>
Mykola Golub  [Mon, 31 May 2021 16:34:53 +0000  (19:34 +0300)] 
Merge pull request #41514 from ideepika/wip-49592-upgrade
Reviewed-by: Kefu Chai <kchai@redhat.com>
Reviewed-by: Mykola Golub <mgolub@suse.com>
Sage Weil  [Mon, 31 May 2021 16:26:01 +0000  (11:26 -0500)] 
doc/foundation: remove amihan
Signed-off-by: Sage Weil <sage@newdream.net>
Ernesto Puerta  [Mon, 31 May 2021 11:45:40 +0000  (13:45 +0200)] 
mgr/dashboard: pass Grafana datasource in URL
Fixes: https://tracker.ceph.com/issues/51026
Signed-off-by: Ernesto Puerta <epuertat@redhat.com>
Kefu Chai  [Mon, 31 May 2021 12:07:33 +0000  (20:07 +0800)] 
Merge pull request #41589 from tchaikov/wip-crimson-start-up-error
Reviewed-by: Radoslaw Zarzynski <rzarzyns@redhat.com>
Kefu Chai  [Sat, 29 May 2021 08:24:59 +0000  (16:24 +0800)] 
crimson/os/alienstore: do not cleanup if not started
Signed-off-by: Kefu Chai <kchai@redhat.com>
Kefu Chai  [Sat, 29 May 2021 08:03:50 +0000  (16:03 +0800)] 
crimson/os/alienstore: create tp in AlienStore::start()
Signed-off-by: Kefu Chai <kchai@redhat.com>
Kefu Chai  [Sat, 29 May 2021 07:08:18 +0000  (15:08 +0800)] 
crimson/osd/main: always stop osd as long as it started
Signed-off-by: Kefu Chai <kchai@redhat.com>
Kefu Chai  [Sat, 29 May 2021 07:03:01 +0000  (15:03 +0800)] 
crimson/osd/main: do cleanup using defer()
Signed-off-by: Kefu Chai <kchai@redhat.com>
Kefu Chai  [Sat, 29 May 2021 06:51:09 +0000  (14:51 +0800)] 
crimson/osd/main: catch exception thrown in the async() call
Signed-off-by: Kefu Chai <kchai@redhat.com>
Misono Tomohiro  [Mon, 31 May 2021 11:58:53 +0000  (20:58 +0900)] 
vstart: update podman detection
Signed-off-by: Misono Tomohiro <misono.tomohiro@jp.fujitsu.com>
Deepika  [Mon, 24 May 2021 21:20:39 +0000  (21:20 +0000)] 
qa/upgrade: conditionally disable update_features tests
With the recent support for async rbd operations in pacific and later,
a hang can occur when an older client (without async support) is being
upgraded and simultaneously interacts with a newer client that expects
requests to be async: the return code for request completion is taken as
the acknowledgement of the async request, and the client then keeps
waiting for another acknowledgement of request completion.
This should be rare, happening only when the lock owner is an old client,
and should be deferred if compatibility issues arise.
see also: 541230475d3b25ab18c4eb9bc5011060462594a6 (octopus)
Signed-off-by: Deepika <dupadhya@redhat.com> 
Ilya Dryomov  [Wed, 26 May 2021 12:21:22 +0000  (14:21 +0200)] 
librbd: don't stop at the first unremovable image when purging
Fixes: https://tracker.ceph.com/issues/51021
Signed-off-by: Ilya Dryomov <idryomov@gmail.com>
Ilya Dryomov  [Wed, 26 May 2021 12:21:22 +0000  (14:21 +0200)] 
rbd: combined error message for expected Trash::purge() errors
Signed-off-by: Ilya Dryomov <idryomov@gmail.com>
Zac Dover  [Mon, 31 May 2021 04:15:56 +0000  (14:15 +1000)] 
doc/cephadm: enriching "Service Specification"
Signed-off-by: Zac Dover <zac.dover@gmail.com>
Zac Dover  [Mon, 31 May 2021 03:55:20 +0000  (13:55 +1000)] 
doc/cephadm: enriching "daemon status"
Signed-off-by: Zac Dover <zac.dover@gmail.com>
Kefu Chai  [Mon, 31 May 2021 01:40:50 +0000  (09:40 +0800)] 
Merge pull request #41552 from tchaikov/wip-mgr-find-roots
Reviewed-by: Avan Thakkar <athakkar@redhat.com>
J. Eric Ivancich  [Sat, 29 May 2021 16:18:45 +0000  (12:18 -0400)] 
Merge pull request #41563 from cybozu/rgw-add-the-description-of-blocking-io-during-index-resharding
Reviewed-by: Matt Benjamin mbenjamin@redhat.com
Reviewed-by: J. Eric Ivancich <ivancich@redhat.com>
Kefu Chai  [Sat, 29 May 2021 06:48:11 +0000  (14:48 +0800)] 
crimson/osd/main: handle and rethrow exception in fetch_config()
Signed-off-by: Kefu Chai <kchai@redhat.com>
Kefu Chai  [Sat, 29 May 2021 05:45:41 +0000  (13:45 +0800)] 
test/crimson/test_messenger: add editor variables in header
Signed-off-by: Kefu Chai <kchai@redhat.com>
Kefu Chai  [Sat, 29 May 2021 05:44:29 +0000  (13:44 +0800)] 
crimson/osd/main: do cleanup using defer() in fetch_config()
Signed-off-by: Kefu Chai <kchai@redhat.com>
Kefu Chai  [Sat, 29 May 2021 03:52:45 +0000  (11:52 +0800)] 
vstart.sh: remove unused variable
Signed-off-by: Kefu Chai <kchai@redhat.com>
Igor Fedotov  [Mon, 17 May 2021 19:21:53 +0000  (22:21 +0300)] 
test/allocator_replay_test: make allocator type configurable
Signed-off-by: Igor Fedotov <ifedotov@suse.com>
Kefu Chai  [Sat, 29 May 2021 02:42:14 +0000  (10:42 +0800)] 
Merge pull request #41278 from sebastian-philipp/mgr-cephadm-set-user-no-hosts
Reviewed-by: Juan Miguel Olmo <jolmomar@redhat.com>
Reviewed-by: Adam King <adking@redhat.com>
Kefu Chai  [Sat, 29 May 2021 02:37:31 +0000  (10:37 +0800)] 
Merge pull request #41520 from tchaikov/wip-osd-unique-ptr
Reviewed-by: Radoslaw Zarzynski <rzarzyns@redhat.com>
Kefu Chai  [Sat, 29 May 2021 02:36:43 +0000  (10:36 +0800)] 
Merge pull request #41573 from tchaikov/wip-allocat-ctor
Reviewed-by: Igor Fedotov <ifedotov@suse.com>
Ilya Dryomov  [Wed, 26 May 2021 12:21:22 +0000  (14:21 +0200)] 
rbd: propagate Trash::purge() result
Signed-off-by: Ilya Dryomov <idryomov@gmail.com>
Kefu Chai  [Fri, 28 May 2021 07:35:01 +0000  (15:35 +0800)] 
Merge pull request #41582 from cyx1231st/wip-seastore-swap-read-extent
Reviewed-by: Kefu Chai <kchai@redhat.com>
Yingxin Cheng  [Thu, 27 May 2021 15:33:25 +0000  (23:33 +0800)] 
crimson/seastore: adopt get_mapping(t, offset) interface
Signed-off-by: Yingxin Cheng <yingxin.cheng@intel.com>
Yingxin Cheng  [Thu, 27 May 2021 08:48:47 +0000  (16:48 +0800)] 
crimson/seastore: implement and test get_mapping(t, laddr)
Signed-off-by: Yingxin Cheng <yingxin.cheng@intel.com>
Yingxin Cheng  [Thu, 27 May 2021 07:02:15 +0000  (15:02 +0800)] 
crimson/seastore: add stub to introduce get_mapping() without length
Signed-off-by: Yingxin Cheng <yingxin.cheng@intel.com>
Kefu Chai  [Fri, 28 May 2021 00:09:07 +0000  (08:09 +0800)] 
Merge pull request #41578 from rzarzynski/wip-crimson-monc-auth-req
Reviewed-by: Kefu Chai <kchai@redhat.com>
Kefu Chai  [Thu, 27 May 2021 23:59:34 +0000  (07:59 +0800)] 
Merge pull request #41544 from tchaikov/wip-doc-confval
Reviewed-by: Neha Ojha <nojha@redhat.com>
Kefu Chai  [Wed, 26 May 2021 04:00:57 +0000  (12:00 +0800)] 
doc/mgr: use confval directive to define options
Signed-off-by: Kefu Chai <kchai@redhat.com>
Yuri Weinstein  [Thu, 27 May 2021 23:40:41 +0000  (16:40 -0700)] 
Merge pull request #41540 from ceph/wip-15213
Reviewed-by: Ilya Dryomov <idryomov@redhat.com>
Reviewed-by: Josh Durgin <jdurgin@redhat.com>
Reviewed-by: Casey Bodley <cbodley@redhat.com>
Reviewed-by: Sebastian Wagner <sebastian.wagner@suse.com>
Reviewed-by: Ramana Raja <rraja@redhat.com>
Reviewed-by: Neha Ojha <nojha@redhat.com>
Sage Weil  [Thu, 27 May 2021 23:14:53 +0000  (19:14 -0400)] 
Merge PR #41483 into master
Reviewed-by: Sebastian Wagner <swagner@suse.com>
zdover23  [Thu, 27 May 2021 21:41:40 +0000  (07:41 +1000)] 
Merge pull request #41561 from zdover23/wip-doc-cephadm-s-mgmt-service-status-improvement-2021-05-26
Reviewed-by: Sebastian Wagner <sewagner@redhat.com>
Sage Weil  [Tue, 25 May 2021 17:55:08 +0000  (13:55 -0400)] 
cephadm: stop passing --no-hosts to podman
This reverts cfc1f914ce74f1fd1f45e2efd3ba2ddcb2da129a, which is no longer
necessary because (1) we don't use socket.getfqdn(), and (2) we generally
do not rely on DNS or /etc/hosts at all anymore (with the exception of
the upgrade transition).
Signed-off-by: Sage Weil <sage@newdream.net> 
Sage Weil  [Wed, 26 May 2021 22:38:05 +0000  (18:38 -0400)] 
mgr/nfs: use host.addr for backend IP where possible
Signed-off-by: Sage Weil <sage@newdream.net>
Sage Weil  [Tue, 25 May 2021 20:10:49 +0000  (16:10 -0400)] 
mgr/cephadm: convert host addr if non-IP to IP
Signed-off-by: Sage Weil <sage@newdream.net>
Sage Weil  [Tue, 25 May 2021 17:00:35 +0000  (13:00 -0400)] 
mgr/dashboard,prometheus: new method of getting mgr IP
Signed-off-by: Sage Weil <sage@newdream.net>
Sage Weil  [Tue, 25 May 2021 16:14:39 +0000  (12:14 -0400)] 
doc/cephadm: remove any reference to the use of DNS or /etc/hosts
Signed-off-by: Sage Weil <sage@newdream.net>
Sage Weil  [Fri, 21 May 2021 17:31:31 +0000  (13:31 -0400)] 
mgr/cephadm: use known host addr
Signed-off-by: Sage Weil <sage@newdream.net>
Radoslaw Zarzynski  [Thu, 27 May 2021 14:55:40 +0000  (14:55 +0000)] 
crimson/monc: handle_auth_request() doesn't depend on active_con.
The following crash occurred at Sepia [1]:
```
INFO  2021-05-26 20:16:32,872 [shard 0] ms - [osd.0(client) v2:172.21.15.119:6803/31733 >> unknown.? -@55220] ProtocolV2::start_accept(): target_addr=172.21.15.119:55220/0
DEBUG 2021-05-26 20:16:32,872 [shard 0] ms - [osd.0(client) v2:172.21.15.119:6803/31733 >> unknown.? -@55220] TRIGGER ACCEPTING, was NONE
DEBUG 2021-05-26 20:16:32,873 [shard 0] ms - [osd.0(client) v2:172.21.15.119:6803/31733 >> unknown.? -@55220] SEND(26) banner: len_payload=16, supported=1, required=0, banner="ceph v2
"
DEBUG 2021-05-26 20:16:32,873 [shard 0] ms - [osd.0(client) v2:172.21.15.119:6803/31733 >> unknown.? -@55220] RECV(10) banner: "ceph v2
"
DEBUG 2021-05-26 20:16:32,873 [shard 0] ms - [osd.0(client) v2:172.21.15.119:6803/31733 >> unknown.? -@55220] GOT banner: payload_len=16
DEBUG 2021-05-26 20:16:32,873 [shard 0] ms - [osd.0(client) v2:172.21.15.119:6803/31733 >> unknown.? -@55220] RECV(16) banner features: supported=1 required=0
DEBUG 2021-05-26 20:16:32,873 [shard 0] ms - [osd.0(client) v2:172.21.15.119:6803/31733 >> unknown.? -@55220] WRITE HelloFrame: my_type=osd, peer_addr=172.21.15.119:55220/0
DEBUG 2021-05-26 20:16:32,873 [shard 0] ms - [osd.0(client) v2:172.21.15.119:6803/31733 >> unknown.? -@55220] GOT HelloFrame: my_type=client peer_addr=v2:172.21.15.119:6803/31733
INFO  2021-05-26 20:16:32,873 [shard 0] ms - [osd.0(client) v2:172.21.15.119:6803/31733 >> client.? -@55220] UPDATE: peer_type=client, policy(lossy=true server=true standby=false resetcheck=false)
DEBUG 2021-05-26 20:16:32,873 [shard 0] ms - [osd.0(client) v2:172.21.15.119:6803/31733 >> client.? -@55220] GOT AuthRequestFrame: method=2, preferred_modes={1, 2}, payload_len=174
/home/jenkins-build/build/workspace/ceph-dev-new-build/ARCH/x86_64/AVAILABLE_ARCH/x86_64/AVAILABLE_DIST/centos8/DIST/centos8/MACHINE_SIZE/gigantic/release/17.0.0-4622-gaa1dc559/rpm/el8/BUILD/ceph-17.0.0-4622-gaa1dc559/src/crimson/mon/MonClient.cc:399:10: runtime error: member access within null pointer of type 'struct Connection'
Segmentation fault on shard 0.
Backtrace:
 0# 0x000055E84CF44C1F in ceph-osd
 1# FatalSignal::signaled(int, siginfo_t const*) in ceph-osd
 2# FatalSignal::install_oneshot_signal_handler<11>()::{lambda(int, siginfo_t*, void*)#1}::_FUN(int, siginfo_t*, void*) in ceph-osd
 3# 0x00007F2BC88C0B20 in /lib64/libpthread.so.0
 4# crimson::mon::Connection::get_conn() in ceph-osd
 5# crimson::mon::Client::handle_auth_request(seastar::shared_ptr<crimson::net::Connection>, seastar::lw_shared_ptr<AuthConnectionMeta>, bool, unsigned int, ceph::buffer::v15_2_0::list const&, ceph::buffer::v15_2_0::list*) in ceph-osd
 6# crimson::net::ProtocolV2::_handle_auth_request(ceph::buffer::v15_2_0::list&, bool) in ceph-osd
 7# 0x000055E84DF67669 in ceph-osd
 8# 0x000055E84DF68775 in ceph-osd
 9# 0x000055E846F47F60 in ceph-osd
10# 0x000055E85296770F in ceph-osd
11# 0x000055E85296CC50 in ceph-osd
12# 0x000055E852B1ECBB in ceph-osd
13# 0x000055E85267C73A in ceph-osd
14# main in ceph-osd
15# __libc_start_main in /lib64/libc.so.6
16# _start in ceph-osd
Fault at location: 0x98
```
[1]: http://pulpito.front.sepia.ceph.com/rzarzynski-2021-05-26_12:20:26-rados-master-distro-basic-smithi/6136907
When `handle_auth_request()` is called, there is no guarantee that
`active_con` is available. This is reflected in the classical
implementation, which works on the `con` argument instead:
```cpp
int MonClient::handle_auth_request(
  Connection *con,
  // ...
  ceph::buffer::list *reply)
{
  // ...
  bool isvalid = ah->verify_authorizer(
    cct,
    *rotating_secrets,
    payload,
    auth_meta->get_connection_secret_length(),
    reply,
    &con->peer_name,
    &con->peer_global_id,
    &con->peer_caps_info,
    &auth_meta->session_key,
    &auth_meta->connection_secret,
    ac);
```
The patch transplants the same logic to crimson.
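The idea can be sketched in isolation as follows. This is only an illustration under simplifying assumptions: `PeerConn`, `Client` and the `claimed_name` parameter are hypothetical stand-ins, not the crimson or classical types, and no real authorizer is verified.

```cpp
#include <memory>
#include <string>

// Hypothetical stand-in for a messenger connection.
struct PeerConn {
  std::string peer_name;
};

// Hypothetical stand-in for the monitor client.
struct Client {
  // Connection to the monitor; null until one has been established.
  std::unique_ptr<PeerConn> active_con;

  // Handles an incoming auth request using only the connection it
  // arrived on, so a null active_con is never dereferenced.
  bool handle_auth_request(PeerConn& con, const std::string& claimed_name) {
    if (claimed_name.empty()) {
      return false;  // nothing to authenticate
    }
    con.peer_name = claimed_name;
    return true;
  }
};
```

With a default-constructed `Client`, `active_con` is null, yet calling `handle_auth_request()` on an accepted connection is still safe because the handler never reads it.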
Signed-off-by: Radoslaw Zarzynski <rzarzyns@redhat.com> 
Kefu Chai  [Thu, 27 May 2021 14:26:05 +0000  (22:26 +0800)] 
os/bluestore: pass string_view to ctor of Allocator
Signed-off-by: Kefu Chai <kchai@redhat.com>
Kefu Chai  [Thu, 27 May 2021 15:14:36 +0000  (23:14 +0800)] 
tools/ceph_objectstore_tool: destruct ObjectStore using unique_ptr<>
Signed-off-by: Kefu Chai <kchai@redhat.com>
Kefu Chai  [Thu, 27 May 2021 03:08:48 +0000  (11:08 +0800)] 
osd: pass unique_ptr<ObjectStore> to ctor of OSD
Signed-off-by: Kefu Chai <kchai@redhat.com>