kernel_optimize_test/net/ipv4
Eric Dumazet 9709674e68 ipv4: fix a race in ip4_datagram_release_cb()
Alexey gave a AddressSanitizer[1] report that finally gave a good hint
at where was the origin of various problems already reported by Dormando
in the past [2]

Problem comes from the fact that UDP can have a lockless TX path, and
concurrent threads can manipulate sk_dst_cache, while another thread,
is holding socket lock and calls __sk_dst_set() in
ip4_datagram_release_cb() (this was added in linux-3.8)

It seems that all we need to do is to use sk_dst_check() and
sk_dst_set() so that all the writers hold same spinlock
(sk->sk_dst_lock) to prevent corruptions.

TCP stack do not need this protection, as all sk_dst_cache writers hold
the socket lock.

[1]
https://code.google.com/p/address-sanitizer/wiki/AddressSanitizerForKernel

AddressSanitizer: heap-use-after-free in ipv4_dst_check
Read of size 2 by thread T15453:
 [<ffffffff817daa3a>] ipv4_dst_check+0x1a/0x90 ./net/ipv4/route.c:1116
 [<ffffffff8175b789>] __sk_dst_check+0x89/0xe0 ./net/core/sock.c:531
 [<ffffffff81830a36>] ip4_datagram_release_cb+0x46/0x390 ??:0
 [<ffffffff8175eaea>] release_sock+0x17a/0x230 ./net/core/sock.c:2413
 [<ffffffff81830882>] ip4_datagram_connect+0x462/0x5d0 ??:0
 [<ffffffff81846d06>] inet_dgram_connect+0x76/0xd0 ./net/ipv4/af_inet.c:534
 [<ffffffff817580ac>] SYSC_connect+0x15c/0x1c0 ./net/socket.c:1701
 [<ffffffff817596ce>] SyS_connect+0xe/0x10 ./net/socket.c:1682
 [<ffffffff818b0a29>] system_call_fastpath+0x16/0x1b
./arch/x86/kernel/entry_64.S:629

Freed by thread T15455:
 [<ffffffff8178d9b8>] dst_destroy+0xa8/0x160 ./net/core/dst.c:251
 [<ffffffff8178de25>] dst_release+0x45/0x80 ./net/core/dst.c:280
 [<ffffffff818304c1>] ip4_datagram_connect+0xa1/0x5d0 ??:0
 [<ffffffff81846d06>] inet_dgram_connect+0x76/0xd0 ./net/ipv4/af_inet.c:534
 [<ffffffff817580ac>] SYSC_connect+0x15c/0x1c0 ./net/socket.c:1701
 [<ffffffff817596ce>] SyS_connect+0xe/0x10 ./net/socket.c:1682
 [<ffffffff818b0a29>] system_call_fastpath+0x16/0x1b
./arch/x86/kernel/entry_64.S:629

Allocated by thread T15453:
 [<ffffffff8178d291>] dst_alloc+0x81/0x2b0 ./net/core/dst.c:171
 [<ffffffff817db3b7>] rt_dst_alloc+0x47/0x50 ./net/ipv4/route.c:1406
 [<     inlined    >] __ip_route_output_key+0x3e8/0xf70
__mkroute_output ./net/ipv4/route.c:1939
 [<ffffffff817dde08>] __ip_route_output_key+0x3e8/0xf70 ./net/ipv4/route.c:2161
 [<ffffffff817deb34>] ip_route_output_flow+0x14/0x30 ./net/ipv4/route.c:2249
 [<ffffffff81830737>] ip4_datagram_connect+0x317/0x5d0 ??:0
 [<ffffffff81846d06>] inet_dgram_connect+0x76/0xd0 ./net/ipv4/af_inet.c:534
 [<ffffffff817580ac>] SYSC_connect+0x15c/0x1c0 ./net/socket.c:1701
 [<ffffffff817596ce>] SyS_connect+0xe/0x10 ./net/socket.c:1682
 [<ffffffff818b0a29>] system_call_fastpath+0x16/0x1b
./arch/x86/kernel/entry_64.S:629

[2]
<4>[196727.311203] general protection fault: 0000 [#1] SMP
<4>[196727.311224] Modules linked in: xt_TEE xt_dscp xt_DSCP macvlan bridge coretemp crc32_pclmul ghash_clmulni_intel gpio_ich microcode ipmi_watchdog ipmi_devintf sb_edac edac_core lpc_ich mfd_core tpm_tis tpm tpm_bios ipmi_si ipmi_msghandler isci igb libsas i2c_algo_bit ixgbe ptp pps_core mdio
<4>[196727.311333] CPU: 17 PID: 0 Comm: swapper/17 Not tainted 3.10.26 #1
<4>[196727.311344] Hardware name: Supermicro X9DRi-LN4+/X9DR3-LN4+/X9DRi-LN4+/X9DR3-LN4+, BIOS 3.0 07/05/2013
<4>[196727.311364] task: ffff885e6f069700 ti: ffff885e6f072000 task.ti: ffff885e6f072000
<4>[196727.311377] RIP: 0010:[<ffffffff815f8c7f>]  [<ffffffff815f8c7f>] ipv4_dst_destroy+0x4f/0x80
<4>[196727.311399] RSP: 0018:ffff885effd23a70  EFLAGS: 00010282
<4>[196727.311409] RAX: dead000000200200 RBX: ffff8854c398ecc0 RCX: 0000000000000040
<4>[196727.311423] RDX: dead000000100100 RSI: dead000000100100 RDI: dead000000200200
<4>[196727.311437] RBP: ffff885effd23a80 R08: ffffffff815fd9e0 R09: ffff885d5a590800
<4>[196727.311451] R10: 0000000000000000 R11: 0000000000000000 R12: 0000000000000000
<4>[196727.311464] R13: ffffffff81c8c280 R14: 0000000000000000 R15: ffff880e85ee16ce
<4>[196727.311510] FS:  0000000000000000(0000) GS:ffff885effd20000(0000) knlGS:0000000000000000
<4>[196727.311554] CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
<4>[196727.311581] CR2: 00007a46751eb000 CR3: 0000005e65688000 CR4: 00000000000407e0
<4>[196727.311625] DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
<4>[196727.311669] DR3: 0000000000000000 DR6: 00000000ffff0ff0 DR7: 0000000000000400
<4>[196727.311713] Stack:
<4>[196727.311733]  ffff8854c398ecc0 ffff8854c398ecc0 ffff885effd23ab0 ffffffff815b7f42
<4>[196727.311784]  ffff88be6595bc00 ffff8854c398ecc0 0000000000000000 ffff8854c398ecc0
<4>[196727.311834]  ffff885effd23ad0 ffffffff815b86c6 ffff885d5a590800 ffff8816827821c0
<4>[196727.311885] Call Trace:
<4>[196727.311907]  <IRQ>
<4>[196727.311912]  [<ffffffff815b7f42>] dst_destroy+0x32/0xe0
<4>[196727.311959]  [<ffffffff815b86c6>] dst_release+0x56/0x80
<4>[196727.311986]  [<ffffffff81620bd5>] tcp_v4_do_rcv+0x2a5/0x4a0
<4>[196727.312013]  [<ffffffff81622b5a>] tcp_v4_rcv+0x7da/0x820
<4>[196727.312041]  [<ffffffff815fd9e0>] ? ip_rcv_finish+0x360/0x360
<4>[196727.312070]  [<ffffffff815de02d>] ? nf_hook_slow+0x7d/0x150
<4>[196727.312097]  [<ffffffff815fd9e0>] ? ip_rcv_finish+0x360/0x360
<4>[196727.312125]  [<ffffffff815fda92>] ip_local_deliver_finish+0xb2/0x230
<4>[196727.312154]  [<ffffffff815fdd9a>] ip_local_deliver+0x4a/0x90
<4>[196727.312183]  [<ffffffff815fd799>] ip_rcv_finish+0x119/0x360
<4>[196727.312212]  [<ffffffff815fe00b>] ip_rcv+0x22b/0x340
<4>[196727.312242]  [<ffffffffa0339680>] ? macvlan_broadcast+0x160/0x160 [macvlan]
<4>[196727.312275]  [<ffffffff815b0c62>] __netif_receive_skb_core+0x512/0x640
<4>[196727.312308]  [<ffffffff811427fb>] ? kmem_cache_alloc+0x13b/0x150
<4>[196727.312338]  [<ffffffff815b0db1>] __netif_receive_skb+0x21/0x70
<4>[196727.312368]  [<ffffffff815b0fa1>] netif_receive_skb+0x31/0xa0
<4>[196727.312397]  [<ffffffff815b1ae8>] napi_gro_receive+0xe8/0x140
<4>[196727.312433]  [<ffffffffa00274f1>] ixgbe_poll+0x551/0x11f0 [ixgbe]
<4>[196727.312463]  [<ffffffff815fe00b>] ? ip_rcv+0x22b/0x340
<4>[196727.312491]  [<ffffffff815b1691>] net_rx_action+0x111/0x210
<4>[196727.312521]  [<ffffffff815b0db1>] ? __netif_receive_skb+0x21/0x70
<4>[196727.312552]  [<ffffffff810519d0>] __do_softirq+0xd0/0x270
<4>[196727.312583]  [<ffffffff816cef3c>] call_softirq+0x1c/0x30
<4>[196727.312613]  [<ffffffff81004205>] do_softirq+0x55/0x90
<4>[196727.312640]  [<ffffffff81051c85>] irq_exit+0x55/0x60
<4>[196727.312668]  [<ffffffff816cf5c3>] do_IRQ+0x63/0xe0
<4>[196727.312696]  [<ffffffff816c5aaa>] common_interrupt+0x6a/0x6a
<4>[196727.312722]  <EOI>
<1>[196727.313071] RIP  [<ffffffff815f8c7f>] ipv4_dst_destroy+0x4f/0x80
<4>[196727.313100]  RSP <ffff885effd23a70>
<4>[196727.313377] ---[ end trace 64b3f14fae0f2e29 ]---
<0>[196727.380908] Kernel panic - not syncing: Fatal exception in interrupt

Reported-by: Alexey Preobrazhensky <preobr@google.com>
Reported-by: dormando <dormando@rydia.ne>
Signed-off-by: Eric Dumazet <edumazet@google.com>
Fixes: 8141ed9fce ("ipv4: Add a socket release callback for datagram sockets")
Cc: Steffen Klassert <steffen.klassert@secunet.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2014-06-11 15:39:18 -07:00
..
netfilter Merge git://git.kernel.org/pub/scm/linux/kernel/git/pablo/nf-next 2014-05-30 17:54:47 -07:00
af_inet.c gre: Call gso_make_checksum 2014-06-04 22:46:38 -07:00
ah4.c ah4: Use the IPsec protocol multiplexer API 2014-02-25 07:04:17 +01:00
arp.c
cipso_ipv4.c
datagram.c ipv4: fix a race in ip4_datagram_release_cb() 2014-06-11 15:39:18 -07:00
devinet.c ipv4: minor spelling fix 2014-05-18 21:10:29 -04:00
esp4.c esp4: Use the IPsec protocol multiplexer API 2014-02-25 07:04:17 +01:00
fib_frontend.c ipv4, fib: pass LOOPBACK_IFINDEX instead of 0 to flowi4_iif 2014-04-16 15:05:11 -04:00
fib_lookup.h
fib_rules.c
fib_semantics.c ipv4: fib_semantics: increment fib_info_cnt after fib_info allocation 2014-05-07 17:14:32 -04:00
fib_trie.c net: Revert "fib_trie: use seq_file_net rather than seq->private" 2014-06-04 15:11:41 -07:00
gre_demux.c gre: Call gso_make_checksum 2014-06-04 22:46:38 -07:00
gre_offload.c gre: Call gso_make_checksum 2014-06-04 22:46:38 -07:00
icmp.c net: add a sysctl to reflect the fwmark on replies 2014-05-13 18:35:08 -04:00
igmp.c inetpeer: get rid of ip_id_count 2014-06-02 11:00:41 -07:00
inet_connection_sock.c ipv4: make ip_local_reserved_ports per netns 2014-05-14 15:31:45 -04:00
inet_diag.c
inet_fragment.c inet: frag: make sure forced eviction removes all frags 2014-03-06 15:28:45 -05:00
inet_hashtables.c net: Use a more standard macro for INET_ADDR_COOKIE 2014-05-14 16:07:23 -04:00
inet_lro.c
inet_timewait_sock.c
inetpeer.c inetpeer: get rid of ip_id_count 2014-06-02 11:00:41 -07:00
ip_forward.c net: rename local_df to ignore_df 2014-05-12 14:03:41 -04:00
ip_fragment.c ipv4: fix "conntrack zones" support for defrag user check in ip_expire 2014-05-05 16:02:59 +02:00
ip_gre.c gre: allow changing mac address when device is up 2014-06-10 22:46:42 -07:00
ip_input.c
ip_options.c ipv4: Use predefined value for readability 2014-04-28 13:28:43 -04:00
ip_output.c inetpeer: get rid of ip_id_count 2014-06-02 11:00:41 -07:00
ip_sockglue.c ipv4: yet another new IP_MTU_DISCOVER option IP_PMTUDISC_OMIT 2014-02-26 15:51:00 -05:00
ip_tunnel_core.c net: Support for multiple checksums with gso 2014-06-04 22:46:38 -07:00
ip_tunnel.c Merge git://git.kernel.org/pub/scm/linux/kernel/git/davem/net 2014-05-24 00:32:30 -04:00
ip_vti.c Merge branch 'master' of git://git.kernel.org/pub/scm/linux/kernel/git/klassert/ipsec 2014-05-15 23:23:48 -04:00
ipcomp.c ipcomp4: Use the IPsec protocol multiplexer API 2014-02-25 07:04:17 +01:00
ipconfig.c ipv4: ipconfig.c: add parentheses in an if statement 2014-02-14 00:14:23 -05:00
ipip.c net: tunnels - enable module autoloading 2014-05-21 15:46:52 -04:00
ipmr.c inetpeer: get rid of ip_id_count 2014-06-02 11:00:41 -07:00
Kconfig
Makefile xfrm4: Add IPsec protocol multiplexer 2014-02-25 07:04:16 +01:00
netfilter.c netfilter: remove double colon 2014-02-19 11:41:25 +01:00
ping.c ping: move ping_group_range out of CONFIG_SYSCTL 2014-05-08 22:50:47 -04:00
proc.c net: clean up snmp stats code 2014-05-07 16:06:05 -04:00
protocol.c
raw.c inetpeer: get rid of ip_id_count 2014-06-02 11:00:41 -07:00
route.c inetpeer: get rid of ip_id_count 2014-06-02 11:00:41 -07:00
syncookies.c net: support marking accepting TCP sockets 2014-05-13 18:35:09 -04:00
sysctl_net_ipv4.c ipv4: make ip_local_reserved_ports per netns 2014-05-14 15:31:45 -04:00
tcp_bic.c tcp: remove in_flight parameter from cong_avoid() methods 2014-05-03 19:23:07 -04:00
tcp_cong.c tcp: remove in_flight parameter from cong_avoid() methods 2014-05-03 19:23:07 -04:00
tcp_cubic.c Merge git://git.kernel.org/pub/scm/linux/kernel/git/davem/net 2014-05-12 13:19:14 -04:00
tcp_diag.c
tcp_fastopen.c tcp: IPv6 support for fastopen server 2014-05-13 17:53:03 -04:00
tcp_highspeed.c tcp: remove in_flight parameter from cong_avoid() methods 2014-05-03 19:23:07 -04:00
tcp_htcp.c tcp: remove in_flight parameter from cong_avoid() methods 2014-05-03 19:23:07 -04:00
tcp_hybla.c tcp: remove in_flight parameter from cong_avoid() methods 2014-05-03 19:23:07 -04:00
tcp_illinois.c tcp: remove in_flight parameter from cong_avoid() methods 2014-05-03 19:23:07 -04:00
tcp_input.c tcp: add gfp parameter to tcp_fragment 2014-06-10 22:30:58 -07:00
tcp_ipv4.c net: support marking accepting TCP sockets 2014-05-13 18:35:09 -04:00
tcp_lp.c tcp: remove in_flight parameter from cong_avoid() methods 2014-05-03 19:23:07 -04:00
tcp_memcontrol.c cgroup: drop const from @buffer of cftype->write_string() 2014-03-19 10:23:54 -04:00
tcp_metrics.c net: use the new API kvfree() 2014-06-05 00:49:51 -07:00
tcp_minisocks.c tcp: use tcp_v4_send_synack on first SYN-ACK 2014-05-13 17:53:02 -04:00
tcp_offload.c gre: Call gso_make_checksum 2014-06-04 22:46:38 -07:00
tcp_output.c tcp: add gfp parameter to tcp_fragment 2014-06-10 22:30:58 -07:00
tcp_probe.c tcp: switch rtt estimations to usec resolution 2014-02-26 17:08:40 -05:00
tcp_scalable.c tcp: remove in_flight parameter from cong_avoid() methods 2014-05-03 19:23:07 -04:00
tcp_timer.c tcp: snmp stats for Fast Open, SYN rtx, and data pkts 2014-03-03 15:58:03 -05:00
tcp_vegas.c tcp: remove in_flight parameter from cong_avoid() methods 2014-05-03 19:23:07 -04:00
tcp_vegas.h
tcp_veno.c tcp: remove in_flight parameter from cong_avoid() methods 2014-05-03 19:23:07 -04:00
tcp_westwood.c tcp: remove unused min_cwnd member of tcp_congestion_ops 2014-02-13 18:22:34 -05:00
tcp_yeah.c tcp: remove in_flight parameter from cong_avoid() methods 2014-05-03 19:23:07 -04:00
tcp.c tcp: Add a TCP_FASTOPEN socket option to get a max backlog on its listner 2014-04-20 18:18:54 -04:00
tunnel4.c
udp_diag.c
udp_impl.h
udp_offload.c gre: Call gso_make_checksum 2014-06-04 22:46:38 -07:00
udp.c ipv4: use skb frags api in udp4_hwcsum() 2014-06-05 00:51:47 -07:00
udplite.c net: Eliminate no_check from protosw 2014-05-23 16:28:53 -04:00
xfrm4_input.c xfrm4: Add IPsec protocol multiplexer 2014-02-25 07:04:16 +01:00
xfrm4_mode_beet.c
xfrm4_mode_transport.c
xfrm4_mode_tunnel.c inetpeer: get rid of ip_id_count 2014-06-02 11:00:41 -07:00
xfrm4_output.c Merge git://git.kernel.org/pub/scm/linux/kernel/git/davem/net 2014-05-24 00:32:30 -04:00
xfrm4_policy.c xfrm: Introduce xfrm_input_afinfo to access the the callbacks properly 2014-03-14 07:28:07 +01:00
xfrm4_protocol.c xfrm4: Properly handle unsupported protocols 2014-04-29 08:41:12 +02:00
xfrm4_state.c
xfrm4_tunnel.c