kernel_optimize_test/kernel/bpf
John Fastabend a9f36bf361 bpf: Track subprog poke descriptors correctly and fix use-after-free
commit f263a81451c12da5a342d90572e317e611846f2c upstream.

Subprograms are calling map_poke_track(), but on program release there is no
hook to call map_poke_untrack(). However, on program release, the aux memory
(and poke descriptor table) is freed even though we still have a reference to
it in the element list of the map aux data. When we run map_poke_run(), we then
end up accessing free'd memory, triggering KASAN in prog_array_map_poke_run():

  [...]
  [  402.824689] BUG: KASAN: use-after-free in prog_array_map_poke_run+0xc2/0x34e
  [  402.824698] Read of size 4 at addr ffff8881905a7940 by task hubble-fgs/4337
  [  402.824705] CPU: 1 PID: 4337 Comm: hubble-fgs Tainted: G          I       5.12.0+ #399
  [  402.824715] Call Trace:
  [  402.824719]  dump_stack+0x93/0xc2
  [  402.824727]  print_address_description.constprop.0+0x1a/0x140
  [  402.824736]  ? prog_array_map_poke_run+0xc2/0x34e
  [  402.824740]  ? prog_array_map_poke_run+0xc2/0x34e
  [  402.824744]  kasan_report.cold+0x7c/0xd8
  [  402.824752]  ? prog_array_map_poke_run+0xc2/0x34e
  [  402.824757]  prog_array_map_poke_run+0xc2/0x34e
  [  402.824765]  bpf_fd_array_map_update_elem+0x124/0x1a0
  [...]

The elements concerned are walked as follows:

    for (i = 0; i < elem->aux->size_poke_tab; i++) {
           poke = &elem->aux->poke_tab[i];
    [...]

The access to size_poke_tab is a 4 byte read, verified by checking offsets
in the KASAN dump:

  [  402.825004] The buggy address belongs to the object at ffff8881905a7800
                 which belongs to the cache kmalloc-1k of size 1024
  [  402.825008] The buggy address is located 320 bytes inside of
                 1024-byte region [ffff8881905a7800, ffff8881905a7c00)

The pahole output of bpf_prog_aux:

  struct bpf_prog_aux {
    [...]
    /* --- cacheline 5 boundary (320 bytes) --- */
    u32                        size_poke_tab;        /*   320     4 */
    [...]

In general, subprograms do not necessarily manage their own data structures.
For example, BTF func_info and linfo are just pointers to the main program
structure. This allows reference counting and cleanup to be done on the latter
which simplifies their management a bit. The aux->poke_tab struct, however,
did not follow this logic. The initial proposed fix for this use-after-free
bug further embedded poke data tracking into the subprogram with proper
reference counting. However, Daniel and Alexei questioned why we were treating
these objects special; I agree, its unnecessary. The fix here removes the per
subprogram poke table allocation and map tracking and instead simply points
the aux->poke_tab pointer at the main programs poke table. This way, map
tracking is simplified to the main program and we do not need to manage them
per subprogram.

This also means, bpf_prog_free_deferred(), which unwinds the program reference
counting and kfrees objects, needs to ensure that we don't try to double free
the poke_tab when free'ing the subprog structures. This is easily solved by
NULL'ing the poke_tab pointer. The second detail is to ensure that per
subprogram JIT logic only does fixups on poke_tab[] entries it owns. To do
this, we add a pointer in the poke structure to point at the subprogram value
so JITs can easily check while walking the poke_tab structure if the current
entry belongs to the current program. The aux pointer is stable and therefore
suitable for such comparison. On the jit_subprogs() error path, we omit
cleaning up the poke->aux field because these are only ever referenced from
the JIT side, but on error we will never make it to the JIT, so its fine to
leave them dangling. Removing these pointers would complicate the error path
for no reason. However, we do need to untrack all poke descriptors from the
main program as otherwise they could race with the freeing of JIT memory from
the subprograms. Lastly, a748c6975d ("bpf: propagate poke descriptors to
subprograms") had an off-by-one on the subprogram instruction index range
check as it was testing 'insn_idx >= subprog_start && insn_idx <= subprog_end'.
However, subprog_end is the next subprogram's start instruction.

Fixes: a748c6975d ("bpf: propagate poke descriptors to subprograms")
Signed-off-by: John Fastabend <john.fastabend@gmail.com>
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
Co-developed-by: Daniel Borkmann <daniel@iogearbox.net>
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
Link: https://lore.kernel.org/bpf/20210707223848.14580-2-john.fastabend@gmail.com
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2021-07-25 14:36:21 +02:00
..
preload bpf: Fix umd memory leak in copy_process() 2021-03-30 14:32:03 +02:00
arraymap.c bpf: Allow for map-in-map with dynamic inner array map entries 2020-10-11 10:21:04 -07:00
bpf_inode_storage.c bpf: Change inode_storage's lookup_elem return value from NULL to -EBADF 2021-03-30 14:31:56 +02:00
bpf_iter.c bpf: Fix an unitialized value in bpf_iter 2021-03-04 11:37:33 +01:00
bpf_local_storage.c bpf: Use hlist_add_head_rcu when linking to local_storage 2020-09-19 01:12:35 +02:00
bpf_lru_list.c bpf_lru_list: Read double-checked variable once without lock 2021-03-04 11:37:29 +01:00
bpf_lru_list.h bpf: Fix a typo "inacitve" -> "inactive" 2020-04-06 21:54:10 +02:00
bpf_lsm.c bpf: Update verification logic for LSM programs 2020-11-06 13:15:21 -08:00
bpf_struct_ops_types.h bpf: tcp: Support tcp_congestion_ops in bpf 2020-01-09 08:46:18 -08:00
bpf_struct_ops.c bpf: Fix fexit trampoline. 2021-04-07 15:00:03 +02:00
btf.c bpf: Forbid trampoline attach for functions with variable arguments 2021-06-16 12:01:35 +02:00
cgroup.c bpf, cgroup: Fix problematic bounds check 2021-02-10 09:29:12 +01:00
core.c bpf: Track subprog poke descriptors correctly and fix use-after-free 2021-07-25 14:36:21 +02:00
cpumap.c bpf, cpumap: Remove rcpu pointer from cpu_map_build_skb signature 2020-09-28 23:30:42 +02:00
devmap.c bpf, devmap: Use GFP_KERNEL for xdp bulk queue allocation 2021-03-04 11:37:33 +01:00
disasm.c treewide: Replace GPLv2 boilerplate/reference with SPDX - rule 295 2019-06-05 17:36:38 +02:00
disasm.h treewide: Replace GPLv2 boilerplate/reference with SPDX - rule 295 2019-06-05 17:36:38 +02:00
dispatcher.c bpf: Remove bpf_image tree 2020-03-13 12:49:52 -07:00
hashtab.c bpf: Zero-fill re-used per-cpu map element 2020-11-05 19:55:57 -08:00
helpers.c bpf, lockdown, audit: Fix buggy SELinux lockdown permission checks 2021-06-10 13:39:19 +02:00
inode.c bpf: link: Refuse non-O_RDWR flags in BPF_OBJ_GET 2021-04-14 08:42:00 +02:00
local_storage.c bpf/local_storage: Fix build without CONFIG_CGROUP 2020-07-25 20:16:36 -07:00
lpm_trie.c bpf: Add map_meta_equal map ops 2020-08-28 15:41:30 +02:00
Makefile bpf: Don't rely on GCC __attribute__((optimize)) to disable GCSE 2020-10-29 20:01:46 -07:00
map_in_map.c bpf: Relax max_entries check for most of the inner map types 2020-08-28 15:41:30 +02:00
map_in_map.h bpf: Add map_meta_equal map ops 2020-08-28 15:41:30 +02:00
map_iter.c bpf: Implement link_query callbacks in map element iterators 2020-08-21 14:01:39 -07:00
net_namespace.c bpf: Add support for forced LINK_DETACH command 2020-08-01 20:38:28 -07:00
offload.c bpf, offload: Replace bitwise AND by logical AND in bpf_prog_offload_info_fill 2020-02-17 16:53:49 +01:00
percpu_freelist.c bpf: Use raw_spin_trylock() for pcpu_freelist_push/pop in NMI 2020-10-06 00:04:11 +02:00
percpu_freelist.h bpf: Use raw_spin_trylock() for pcpu_freelist_push/pop in NMI 2020-10-06 00:04:11 +02:00
prog_iter.c bpf: Refactor bpf_iter_reg to have separate seq_info member 2020-07-25 20:16:32 -07:00
queue_stack_maps.c bpf: Add map_meta_equal map ops 2020-08-28 15:41:30 +02:00
reuseport_array.c bpf, net: Rework cookie generator as per-cpu one 2020-09-30 11:50:35 -07:00
ringbuf.c bpf: Fix false positive kmemleak report in bpf_ringbuf_area_alloc() 2021-07-19 09:44:54 +02:00
stackmap.c bpf: Refcount task stack in bpf_get_task_stack 2021-04-14 08:42:01 +02:00
syscall.c bpf: Prevent double bpf_prog_put call from bpf_tracing_prog_attach 2021-01-27 11:55:07 +01:00
sysfs_btf.c bpf: Fix sysfs export of empty BTF section 2020-09-21 21:50:24 +02:00
task_iter.c bpf: Save correct stopping point in file seq iteration 2021-01-19 18:27:28 +01:00
tnum.c bpf: Verifier, do explicit ALU32 bounds tracking 2020-03-30 14:59:53 -07:00
trampoline.c bpf: Fix fexit trampoline. 2021-04-07 15:00:03 +02:00
verifier.c bpf: Track subprog poke descriptors correctly and fix use-after-free 2021-07-25 14:36:21 +02:00