A segmentation fault was observed in a recent test failure
(test_client_cache_size) [1]. An analysis of the core dump revealed that the
client->root inode was NULL. The client log [2] showed that the root inode was
deleted prior to the segfault:
2016-06-27 14:24:13.358500 7fe75cff9700 20 client.4161 _ll_put 0x7fe76000ba40 100000003e9 1 -> 0
2016-06-27 14:24:13.358502 7fe75cff9700 10 client.4161 put_inode on 100000003e9.head(faked_ino=0 ref=1 ll_ref=0 cap_refs={} open={} mode=41777 size=0/0 mtime=2016-06-27 14:24:10.700562 caps=pAsLsXsFs(0=pAsLsXsFs) 0x7fe76000ba40)
2016-06-27 14:24:13.358513 7fe75cff9700 15 inode.put on 0x7fe76000ba40 100000003e9.head now 0
2016-06-27 14:24:13.358515 7fe75cff9700 10 client.4161 remove_cap mds.0 on 100000003e9.head(faked_ino=0 ref=0 ll_ref=0 cap_refs={} open={} mode=41777 size=0/0 mtime=2016-06-27 14:24:10.700562 caps=pAsLsXsFs(0=pAsLsXsFs) 0x7fe76000ba40)
2016-06-27 14:24:13.358538 7fe75cff9700 15 client.4161 remove_cap last one, closing snaprealm 0x7fe76000c1b0
2016-06-27 14:24:13.358544 7fe75cff9700 20 client.4161 put_snap_realm 1 0x7fe76000c1b0 1002 -> 1001
2016-06-27 14:24:13.358546 7fe75cff9700 10 client.4161 put_inode deleting 100000003e9.head(faked_ino=0 ref=0 ll_ref=0 cap_refs={} open={} mode=41777 size=0/0 mtime=2016-06-27 14:24:10.700562 caps=- 0x7fe76000ba40)
After tracing the reference count increments and decrements, I noticed this mismatched pair:
2016-06-27 14:24:13.352622 7fe75dffb700 3 client.4161 ll_forget 100000003e9 1
2016-06-27 14:24:13.352601 7fe7767fc700 1 -- 172.21.15.58:0/3762258427 >> 172.21.15.57:6804/11121 conn(0x7fe789bf3b60 sd=20 :-1 s=STATE_OPEN_MESSAGE_READ_FOOTER_AND_DISPATCH pgs=7 cs=1 l=0). == rx == mds.0 seq 1473 0x7fe764037a80 client_reply(???:470 = 0 (0) Success safe) v1
2016-06-27 14:24:13.352665 7fe75dffb700 20 client.4161 _ll_put 0x7fe76000ba40 100000003e9 1 -> 1
2016-06-27 14:24:13.352687 7fe75d7fa700 3 client.4161 ll_lookup 0x7fe76000ba40 testdir
2016-06-27 14:24:13.352699 7fe75d7fa700 20 client.4161 _lookup have dn testdir mds.-1 ttl 0.000000 seq 0
2016-06-27 14:24:13.352705 7fe75d7fa700 15 inode.get on 0x7fe760011da0 100000003ea.head now 537
2016-06-27 14:24:13.352693 7fe76effd700 1 -- 172.21.15.58:0/3762258427 <== mds.0 172.21.15.57:6804/11121 1473 ==== client_reply(???:470 = 0 (0) Success safe) v1 ==== 27+0+0 (3458149698 0 0) 0x7fe764037a80 con 0x7fe789bf3b60
2016-06-27 14:24:13.352708 7fe75d7fa700 10 client.4161 _lookup 100000003e9.head(faked_ino=0 ref=3 ll_ref=1 cap_refs={} open={} mode=41777 size=0/0 mtime=2016-06-27 14:24:10.700562 caps=pAsLsXsFs(0=pAsLsXsFs) 0x7fe76000ba40) testdir = 100000003ea.head(faked_ino=0 ref=537 ll_ref=999 cap_refs={} open={} mode=40755 size=0/0 mtime=2016-06-27 14:24:13.325297 caps=pAsLsXsFsx(0=pAsLsXsFsx) COMPLETE parents=0x7fe750001f50 0x7fe760011da0)
This ll_forget/ll_lookup sequence is repeated several hundred times in this test. Here is the immediately preceding iteration (in one thread):
2016-06-27 14:24:13.324896 7fe75dffb700 3 client.4161 ll_forget 100000003e9 1
2016-06-27 14:24:13.324904 7fe75dffb700 20 client.4161 _ll_put 0x7fe76000ba40 100000003e9 1 -> 1
2016-06-27 14:24:13.324915 7fe75dffb700 3 client.4161 ll_lookup 0x7fe76000ba40 testdir
2016-06-27 14:24:13.324921 7fe75dffb700 20 client.4161 _lookup have dn testdir mds.-1 ttl 0.000000 seq 0
2016-06-27 14:24:13.324931 7fe75dffb700 15 inode.get on 0x7fe760011da0 100000003ea.head now 621
2016-06-27 14:24:13.324904 7fe7767fc700 1 -- 172.21.15.58:0/3762258427 >> 172.21.15.57:6804/11121 conn(0x7fe789bf3b60 sd=20 :-1 s=STATE_OPEN_MESSAGE_READ_FOOTER_AND_DISPATCH pgs=7 cs=1 l=0). == rx == mds.0 seq 1387 0x7fe764026720 client_reply(???:385 = 0 (0) Success safe) v1
2016-06-27 14:24:13.324934 7fe75dffb700 10 client.4161 _lookup 100000003e9.head(faked_ino=0 ref=3 ll_ref=2 cap_refs={} open={} mode=41777 size=0/0 mtime=2016-06-27 14:24:10.700562 caps=pAsLsXsFs(0=pAsLsXsFs) 0x7fe76000ba40) testdir = 100000003ea.head(faked_ino=0 ref=621 ll_ref=998 cap_refs={} open={} mode=40755 size=0/0 mtime=2016-06-27 14:24:13.321021 caps=pAsLsXsFsx(0=pAsLsXsFsx) COMPLETE parents=0x7fe750001f50 0x7fe760011da0)
The key line is the `_lookup 100000003e9.head(...)` entry: ll_ref=2 is the expected count after this sequence, but, as we see at 14:24:13.352708, ll_ref=1.
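The missing increment is consistent with a lost update on a plain, non-atomic counter (the cause is pinned down in the next paragraph). As a standalone illustration only, not Ceph code, the following program pairs locked and unlocked ++/-- operations on a shared int; every pair nets to zero, so the count "should" end where it started, yet the unlocked read-modify-writes regularly overwrite concurrent updates:

```cpp
// Illustration only, not Ceph code: a plain (non-atomic) reference count
// modified by one thread under a mutex and by another thread without it.
#include <iostream>
#include <mutex>
#include <thread>

int ll_ref = 1;          // stands in for the inode's ll_ref counter
std::mutex client_lock;  // stands in for Client::client_lock

int main() {
  std::thread locked([] {            // the ll_lookup/ll_forget side:
    for (int i = 0; i < 1000000; ++i) {
      std::lock_guard<std::mutex> l(client_lock);
      ++ll_ref;                      //   increments and decrements are
      --ll_ref;                      //   paired and serialized
    }
  });
  std::thread unlocked([] {          // the unserialized side:
    for (int i = 0; i < 1000000; ++i) {
      ++ll_ref;                      //   same pairing, but no lock, so a
      --ll_ref;                      //   bump can be lost entirely
    }
  });
  locked.join();
  unlocked.join();
  std::cout << "ll_ref = " << ll_ref << " (expected 1)" << std::endl;
  return 0;
}
```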
This racing behavior is not serialized by client_lock because Client::get_root() does not take the client_lock. The race is also hard to spot in the logs because Client::get_root() does not use Client::_ll_get() to take its reference, which would log the increment; instead it bumps the count directly via Inode::ll_get(), which logs nothing.
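For illustration, here is a simplified sketch (not the actual Ceph sources) of why those increments leave no trace: the _ll_get() path writes a debug line for every bump, while the direct ll_get() path only touches the counter.

```cpp
// Simplified sketch, not the real Ceph code: contrast the logged and
// silent increment paths. std::cout stands in for the ldout debug macro.
#include <iostream>

struct Inode {
  long ll_ref = 0;
  void ll_get() { ++ll_ref; }        // silent: no log line emitted
};

struct Client {
  Inode *root = nullptr;
  void _ll_get(Inode *in) {          // the logged path used elsewhere
    in->ll_get();
    std::cout << "_ll_get " << in << " -> " << in->ll_ref << "\n";
  }
  Inode *get_root() {                // pre-fix behavior: no client_lock
    root->ll_get();                  // taken and no log line, so the
    return root;                     // extra reference is invisible in [2]
  }
};

int main() {
  Inode in;
  Client c;
  c.root = &in;
  c._ll_get(&in);  // shows up in the log: "_ll_get ... -> 1"
  c.get_root();    // ll_ref goes 1 -> 2 with no trace at all
  std::cout << "final ll_ref = " << in.ll_ref << "\n";
  return 0;
}
```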
This commit takes the client_lock in Client::get_root() and in Client::get_root_ino() (where it may not strictly be necessary, but can't hurt).
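In the same simplified standalone style (approximate; the real change lives in the cherry-picked commit below and uses Ceph's own locking primitives rather than std::mutex), the fixed accessors look roughly like this:

```cpp
// Approximate shape of the fix, standalone for illustration. Taking
// client_lock in get_root() serializes the ll_ref bump against the
// _ll_put/put_inode path that is concurrently dropping references.
#include <mutex>

struct Inode {
  long ll_ref = 0;
  long ino = 0;                      // inodeno_t in the real code
  void ll_get() { ++ll_ref; }
};

struct Client {
  std::mutex client_lock;
  Inode *root = nullptr;

  Inode *get_root() {
    std::lock_guard<std::mutex> l(client_lock);  // NEW: hold the lock
    root->ll_get();
    return root;
  }

  long get_root_ino() {
    std::lock_guard<std::mutex> l(client_lock);  // NEW: possibly not
    return root->ino;                            // required, but harmless
  }
};

int main() {
  Inode in;
  in.ino = 1;
  Client c;
  c.root = &in;
  return c.get_root()->ll_ref == 1 ? 0 : 1;  // bump now happens under lock
}
```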
[1] http://pulpito.ceph.com/pdonnell-2016-06-27_06:54:23-fs-wip-pdonnell-i16164---basic-smithi/280361/
[2] http://qa-proxy.ceph.com/teuthology/pdonnell-2016-06-27_06:54:23-fs-wip-pdonnell-i16164---basic-smithi/280361/remote/smithi058/log/ceph-client.0.23448.log.gz
Signed-off-by: Patrick Donnelly <pdonnell@redhat.com>
(cherry picked from commit 208e3b6fd58237b309aae15b18dcd82672b133af)