kernel_optimize_test/net
Jesper Dangaard Brouer fd38d4e675 bpf: Remove MTU check in __bpf_skb_max_len
commit 6306c1189e77a513bf02720450bb43bd4ba5d8ae upstream.

Multiple BPF helpers that can manipulate or increase the size of the SKB use
__bpf_skb_max_len() as the maximum length. This function limits the size
against the current net_device MTU (skb->dev->mtu).

When a BPF program grows the packet size, it should not be limited to the
MTU. The MTU is a transmit limitation, and software receiving this packet
should be allowed to increase the size. Furthermore, the current MTU check in
__bpf_skb_max_len() uses the MTU of the ingress/current net_device, which in
the case of redirects is the wrong net_device.
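
The redirect problem can be illustrated with a small userspace model. This is
only a sketch: struct model_net_device, model_mtu_limit() and
model_grow_allowed() are hypothetical names, not kernel code, and adding the
link-layer header length to the MTU is an assumption about how the old check
was computed.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical stand-in for the struct net_device fields the check consults. */
struct model_net_device {
	uint32_t mtu;             /* transmit MTU in bytes */
	uint32_t hard_header_len; /* link-layer header length in bytes */
};

/*
 * MTU-based limit, evaluated against whichever device the packet is
 * currently attached to. Assumption: the limit is MTU plus L2 header.
 */
static uint32_t model_mtu_limit(const struct model_net_device *dev)
{
	/* Guard against wraparound in the unsigned addition. */
	assert(dev->mtu <= UINT32_MAX - dev->hard_header_len);
	return dev->mtu + dev->hard_header_len;
}

/* A helper growing the packet to new_len succeeds only within the limit. */
static bool model_grow_allowed(uint32_t new_len, uint32_t limit)
{
	return new_len <= limit;
}
```

In this model, a packet received on a device with MTU 1500 and redirected to
a device with MTU 9000 cannot be grown to 4000 bytes, because the limit is
taken from the ingress device, even though the egress device could carry it.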

This patch keeps a sanity maximum of SKB_MAX_ALLOC (16KiB). The real limit
is elsewhere in the system. Jesper's testing[1] showed it was not possible
to exceed 8KiB when expanding the SKB size via a BPF helper. The limiting
factor is the define KMALLOC_MAX_CACHE_SIZE, which is 8192 for the
SLUB allocator (CONFIG_SLUB) when PAGE_SIZE is 4096. This define is in
effect because the allocation is made from softirq context; see
__gfp_pfmemalloc_flags() and __do_kmalloc_node(). Jakub's testing showed
that frames above 16KiB can cause NICs to reset (but not crash). Keep the
sanity limit at this level, as the memory layer can differ based on kernel
config.
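
The relationship between the two limits can be sketched as plain arithmetic.
All MODEL_* names and model_* functions are hypothetical. The 16KiB and 8192
values come from the text above; deriving the kmalloc cache limit as two
pages (PAGE_SHIFT + 1) is an assumption about the SLUB configuration, and the
model ignores per-SKB allocation overhead.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

#define MODEL_PAGE_SHIFT 12u /* PAGE_SIZE 4096, as in the text */

/* Assumed SLUB rule: the largest kmalloc cache covers two pages. */
#define MODEL_KMALLOC_MAX_CACHE_SIZE (1u << (MODEL_PAGE_SHIFT + 1u))

/* Sanity cap, taken as exactly 16KiB as stated in the text. */
#define MODEL_SKB_MAX_ALLOC (16u * 1024u)

/* The allocator limit is the tighter of the two in this configuration. */
static_assert(MODEL_KMALLOC_MAX_CACHE_SIZE < MODEL_SKB_MAX_ALLOC,
	      "kmalloc cache limit is expected to be below the sanity cap");

/* New behaviour in the model: a fixed cap, independent of any device MTU. */
static uint32_t model_bpf_skb_max_len(void)
{
	return MODEL_SKB_MAX_ALLOC;
}

/* First gate: the helper rejects lengths above the sanity cap. */
static bool model_passes_sanity_cap(uint32_t new_len)
{
	return new_len <= model_bpf_skb_max_len();
}

/* Second gate: the backing allocation must fit the largest kmalloc cache. */
static bool model_fits_kmalloc_cache(uint32_t alloc_size)
{
	return alloc_size <= MODEL_KMALLOC_MAX_CACHE_SIZE;
}
```

In this model a request for 9000 bytes passes the sanity cap but does not fit
the kmalloc cache, which matches the observation that growth stops near 8KiB
well before the 16KiB cap is reached.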

[1] https://github.com/xdp-project/bpf-examples/tree/master/MTU-tests

Signed-off-by: Jesper Dangaard Brouer <brouer@redhat.com>
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
Acked-by: John Fastabend <john.fastabend@gmail.com>
Link: https://lore.kernel.org/bpf/161287788936.790810.2937823995775097177.stgit@firesoul
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2021-04-07 15:00:08 +02:00
..
6lowpan
9p net: 9p: advance iov on empty read 2021-04-07 15:00:08 +02:00
802
8021q
appletalk appletalk: Fix skb allocation size in loopback case 2021-04-07 15:00:08 +02:00
atm
ax25
batman-adv
bluetooth Bluetooth: Fix null pointer dereference in amp_read_loc_assoc_final_data 2021-03-07 12:34:10 +01:00
bpf
bpfilter
bridge net: bridge: don't notify switchdev for local FDB addresses 2021-03-30 14:32:04 +02:00
caif
can net: introduce CAN specific pointer in the struct net_device 2021-04-07 15:00:07 +02:00
ceph
core bpf: Remove MTU check in __bpf_skb_max_len 2021-04-07 15:00:08 +02:00
dcb
dccp ipv6: weaken the v4mapped source check 2021-03-30 14:32:01 +02:00
decnet
dns_resolver
dsa net: dsa: tag_mtk: fix 802.1ad VLAN egress 2021-03-17 17:06:22 +01:00
ethernet
ethtool ethtool: fix the check logic of at least one channel for RX/TX 2021-03-17 17:06:16 +01:00
hsr net: hsr: add support for EntryForgetTime 2021-03-07 12:34:07 +01:00
ieee802154
ife
ipv4 Revert "netfilter: x_tables: Update remaining dereference to RCU" 2021-03-30 14:32:06 +02:00
ipv6 Revert "netfilter: x_tables: Update remaining dereference to RCU" 2021-03-30 14:32:06 +02:00
iucv net/af_iucv: remove WARN_ONCE on malformed RX packets 2021-03-07 12:34:05 +01:00
kcm
key
l2tp net: l2tp: reduce log level of messages in receive path, add counter instead 2021-03-17 17:06:11 +01:00
l3mdev
lapb
llc
mac80211 mac80211: fix double free in ibss_leave 2021-03-30 14:32:08 +02:00
mac802154
mpls net: avoid infinite loop in mpls_gso_segment when mpls_hlen == 0 2021-03-17 17:06:11 +01:00
mptcp ipv6: weaken the v4mapped source check 2021-03-30 14:32:01 +02:00
ncsi
netfilter netfilter: x_tables: Use correct memory barriers. 2021-03-30 14:32:06 +02:00
netlabel cipso,calipso: resolve a number of problems with the DOI refcounts 2021-03-17 17:06:15 +01:00
netlink
netrom
nfc
nsh
openvswitch
packet
phonet
psample net: psample: Fix netlink skb length with tunnel info 2021-03-07 12:34:07 +01:00
qrtr net: qrtr: fix a kernel-infoleak in qrtr_recvmsg() 2021-03-30 14:31:58 +02:00
rds
rfkill
rose
rxrpc
sched net/sched: cls_flower: fix only mask bit check in the validate_ct_state 2021-03-30 14:32:01 +02:00
sctp
smc
strparser
sunrpc rpc: fix NULL dereference on kmalloc failure 2021-04-07 15:00:04 +02:00
switchdev
tipc tipc: better validate user input in tipc_nl_retrieve_key() 2021-03-30 14:31:59 +02:00
tls
unix
vmw_vsock selinux: vsock: Set SID for socket returned by accept() 2021-03-30 14:32:03 +02:00
wimax
wireless
x25
xdp
xfrm
compat.c
devres.c
Kconfig
Makefile
socket.c
sysctl_net.c