Commit Graph

538 Commits

Author SHA1 Message Date
John Fastabend
c20a71a7a3 bpf: sockmap remove unused function
This was added for some work that was eventually factored out but the
helper call was missed. Remove it now and add it back later if needed.

Signed-off-by: John Fastabend <john.fastabend@gmail.com>
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
2018-01-04 19:01:14 +01:00
Jakub Kicinski
675fc275a3 bpf: offload: report device information for offloaded programs
Report to the user ifindex and namespace information of offloaded
programs.  If device has disappeared return -ENODEV.  Specify the
namespace using dev/inode combination.

CC: Eric W. Biederman <ebiederm@xmission.com>
Signed-off-by: Jakub Kicinski <jakub.kicinski@netronome.com>
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
2017-12-31 16:12:23 +01:00
Jakub Kicinski
ad8ad79f4f bpf: offload: free program id when device disappears
Bound programs are quite useless after their device disappears.
They are simply waiting for reference count to go to zero,
don't list them in BPF_PROG_GET_NEXT_ID by freeing their ID
early.

Note that orphaned offload programs will return -ENODEV on
BPF_OBJ_GET_INFO_BY_FD so user will never see ID 0.

Signed-off-by: Jakub Kicinski <jakub.kicinski@netronome.com>
Reviewed-by: Quentin Monnet <quentin.monnet@netronome.com>
Acked-by: Alexei Starovoitov <ast@kernel.org>
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
2017-12-31 16:12:23 +01:00
Jakub Kicinski
ce3b9db4db bpf: offload: free prog->aux->offload when device disappears
All bpf offload operations should now be under bpf_devs_lock,
it's safe to free and clear the entire offload structure,
not only the netdev pointer.

__bpf_prog_offload_destroy() will no longer be called multiple
times.

Suggested-by: Alexei Starovoitov <ast@kernel.org>
Signed-off-by: Jakub Kicinski <jakub.kicinski@netronome.com>
Reviewed-by: Quentin Monnet <quentin.monnet@netronome.com>
Acked-by: Alexei Starovoitov <ast@kernel.org>
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
2017-12-31 16:12:23 +01:00
Jakub Kicinski
cae1927c0b bpf: offload: allow netdev to disappear while verifier is running
To allow verifier instruction callbacks without any extra locking
NETDEV_UNREGISTER notification would wait on a waitqueue for verifier
to finish.  This design decision was made when rtnl lock was providing
all the locking.  Use the read/write lock instead and remove the
workqueue.

Verifier will now call into the offload code, so dev_ops are moved
to offload structure.  Since verifier calls are all under
bpf_prog_is_dev_bound() we no longer need static inline implementations
to please builds with CONFIG_NET=n.

Signed-off-by: Jakub Kicinski <jakub.kicinski@netronome.com>
Reviewed-by: Quentin Monnet <quentin.monnet@netronome.com>
Acked-by: Alexei Starovoitov <ast@kernel.org>
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
2017-12-31 16:12:23 +01:00
Jakub Kicinski
9a18eedb14 bpf: offload: don't use prog->aux->offload as boolean
We currently use aux->offload to indicate that program is bound
to a specific device.  This forces us to keep the offload structure
around even after the device is gone.  Add a bool member to
struct bpf_prog_aux to indicate if offload was requested.

Suggested-by: Alexei Starovoitov <ast@kernel.org>
Signed-off-by: Jakub Kicinski <jakub.kicinski@netronome.com>
Reviewed-by: Quentin Monnet <quentin.monnet@netronome.com>
Acked-by: Alexei Starovoitov <ast@kernel.org>
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
2017-12-31 16:12:22 +01:00
Jakub Kicinski
e0d3974ac7 bpf: offload: don't require rtnl for dev list manipulation
We don't need the RTNL lock for all operations on offload state.
We only need to hold it around ndo calls.  The device offload
initialization doesn't require it.  The soon-to-come querying
of the offload info will only need it partially.  We will also
be able to remove the waitqueue in following patches.

Use struct rw_semaphore because map offload will require sleeping
with the semaphore held for read.

Suggested-by: Kirill Tkhai <ktkhai@virtuozzo.com>
Signed-off-by: Jakub Kicinski <jakub.kicinski@netronome.com>
Reviewed-by: Quentin Monnet <quentin.monnet@netronome.com>
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
2017-12-31 16:12:22 +01:00
David S. Miller
fcffe2edbd Merge git://git.kernel.org/pub/scm/linux/kernel/git/bpf/bpf-next
Daniel Borkmann says:

====================
pull-request: bpf-next 2017-12-28

The following pull-request contains BPF updates for your *net-next* tree.

The main changes are:

1) Fix incorrect state pruning related to recognition of zero initialized
   stack slots, where stacksafe exploration would mistakenly return a
   positive pruning verdict too early ignoring other slots, from Gianluca.

2) Various BPF to BPF calls related follow-up fixes. Fix an off-by-one
   in maximum call depth check, and rework maximum stack depth tracking
   logic to fix a bypass of the total stack size check reported by Jann.
   Also fix a bug in arm64 JIT where prog->jited_len was uninitialized.
   Addition of various test cases to BPF selftests, from Alexei.

3) Addition of a BPF selftest to test_verifier that is related to BPF to
   BPF calls which demonstrates a late caller stack size increase and
   thus out of bounds access. Fixed above in 2). Test case from Jann.

4) Addition of correlating BPF helper calls, BPF to BPF calls as well
   as BPF maps to bpftool xlated dump in order to allow for better
   BPF program introspection and debugging, from Daniel.

5) Fixing several bugs in BPF to BPF calls kallsyms handling in order
   to get it actually to work for subprogs, from Daniel.

6) Extending sparc64 JIT support for BPF to BPF calls and fix a couple
   of build errors for libbpf on sparc64, from David.

7) Allow narrower context access for BPF dev cgroup typed programs in
   order to adapt to LLVM code generation. Also adjust memlock rlimit
   in the test_dev_cgroup BPF selftest, from Yonghong.

8) Add netdevsim Kconfig entry to BPF selftests since test_offload.py
   relies on netdevsim device being available, from Jakub.

9) Reduce scope of xdp_do_generic_redirect_map() to being static,
   from Xiongwei.

10) Minor cleanups and spelling fixes in BPF verifier, from Colin.
====================

Signed-off-by: David S. Miller <davem@davemloft.net>
2017-12-27 20:40:32 -05:00
Alexei Starovoitov
aada9ce644 bpf: fix max call depth check
fix off by one error in max call depth check
and add a test

Fixes: f4d7e40a5b ("bpf: introduce function calls (verification)")
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
2017-12-27 18:36:23 +01:00
Alexei Starovoitov
70a87ffea8 bpf: fix maximum stack depth tracking logic
Instead of computing max stack depth for current call chain
during the main verifier pass track stack depth of each
function independently and after do_check() is done do
another pass over all instructions analyzing depth
of all possible call stacks.

Fixes: f4d7e40a5b ("bpf: introduce function calls (verification)")
Reported-by: Jann Horn <jannh@google.com>
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
2017-12-27 18:36:23 +01:00
Gianluca Borello
fd05e57bb3 bpf: fix stacksafe exploration when comparing states
Commit cc2b14d510 ("bpf: teach verifier to recognize zero initialized
stack") introduced a very relaxed check when comparing stacks of different
states, effectively returning a positive result in many cases where it
shouldn't.

This can create problems in cases such as this following C pseudocode:

long var;
long *x = bpf_map_lookup(...);
if (!x)
        return;

if (*x != 0xbeef)
        var = 0;
else
        var = 1;

/* This is the key part, calling a helper causes an explored state
 * to be saved with the information that "var" is on the stack as
 * STACK_ZERO, since the helper is first met by the verifier after
 * the "var = 0" assignment. This state will however be wrongly used
 * also for the "var = 1" case, so the verifier assumes "var" is always
 * 0 and will replace the NULL assignment with nops, because the
 * search pruning prevents it from exploring the faulty branch.
 */
bpf_ktime_get_ns();

if (var)
        *(long *)0 = 0xbeef;

Fix the issue by making sure that the stack is fully explored before
returning a positive comparison result.

Also attach a couple tests that highlight the bad behavior. In the first
test, without this fix instructions 16 and 17 are replaced with nops
instead of being rejected by the verifier.

The second test, instead, allows a program to make a potentially illegal
read from the stack.

Fixes: cc2b14d510 ("bpf: teach verifier to recognize zero initialized stack")
Signed-off-by: Gianluca Borello <g.borello@gmail.com>
Acked-by: Alexei Starovoitov <ast@kernel.org>
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
2017-12-23 11:04:58 -08:00
David S. Miller
fba961ab29 Merge git://git.kernel.org/pub/scm/linux/kernel/git/davem/net
Lots of overlapping changes.  Also on the net-next side
the XDP state management is handled more in the generic
layers so undo the 'net' nfp fix which isn't applicable
in net-next.

Include a necessary change by Jakub Kicinski, with log message:

====================
cls_bpf no longer takes care of offload tracking.  Make sure
netdevsim performs necessary checks.  This fixes a warning
caused by TC trying to remove a filter it has not added.

Signed-off-by: Jakub Kicinski <jakub.kicinski@netronome.com>
Reviewed-by: Quentin Monnet <quentin.monnet@netronome.com>
====================

Signed-off-by: David S. Miller <davem@davemloft.net>
2017-12-22 11:16:31 -05:00
Daniel Borkmann
7105e828c0 bpf: allow for correlation of maps and helpers in dump
Currently a dump of an xlated prog (post verifier stage) doesn't
correlate used helpers as well as maps. The prog info lists
involved map ids, however there's no correlation of where in the
program they are used as of today. Likewise, bpftool does not
correlate helper calls with the target functions.

The latter can be done w/o any kernel changes through kallsyms,
and also has the advantage that this works with inlined helpers
and BPF calls.

Example, via interpreter:

  # tc filter show dev foo ingress
  filter protocol all pref 49152 bpf chain 0
  filter protocol all pref 49152 bpf chain 0 handle 0x1 foo.o:[ingress] \
                      direct-action not_in_hw id 1 tag c74773051b364165   <-- prog id:1

  * Output before patch (calls/maps remain unclear):

  # bpftool prog dump xlated id 1             <-- dump prog id:1
   0: (b7) r1 = 2
   1: (63) *(u32 *)(r10 -4) = r1
   2: (bf) r2 = r10
   3: (07) r2 += -4
   4: (18) r1 = 0xffff95c47a8d4800
   6: (85) call unknown#73040
   7: (15) if r0 == 0x0 goto pc+18
   8: (bf) r2 = r10
   9: (07) r2 += -4
  10: (bf) r1 = r0
  11: (85) call unknown#73040
  12: (15) if r0 == 0x0 goto pc+23
  [...]

  * Output after patch:

  # bpftool prog dump xlated id 1
   0: (b7) r1 = 2
   1: (63) *(u32 *)(r10 -4) = r1
   2: (bf) r2 = r10
   3: (07) r2 += -4
   4: (18) r1 = map[id:2]                     <-- map id:2
   6: (85) call bpf_map_lookup_elem#73424     <-- helper call
   7: (15) if r0 == 0x0 goto pc+18
   8: (bf) r2 = r10
   9: (07) r2 += -4
  10: (bf) r1 = r0
  11: (85) call bpf_map_lookup_elem#73424
  12: (15) if r0 == 0x0 goto pc+23
  [...]

  # bpftool map show id 2                     <-- show/dump/etc map id:2
  2: hash_of_maps  flags 0x0
        key 4B  value 4B  max_entries 3  memlock 4096B

Example, JITed, same prog:

  # tc filter show dev foo ingress
  filter protocol all pref 49152 bpf chain 0
  filter protocol all pref 49152 bpf chain 0 handle 0x1 foo.o:[ingress] \
                  direct-action not_in_hw id 3 tag c74773051b364165 jited

  # bpftool prog show id 3
  3: sched_cls  tag c74773051b364165
        loaded_at Dec 19/13:48  uid 0
        xlated 384B  jited 257B  memlock 4096B  map_ids 2

  # bpftool prog dump xlated id 3
   0: (b7) r1 = 2
   1: (63) *(u32 *)(r10 -4) = r1
   2: (bf) r2 = r10
   3: (07) r2 += -4
   4: (18) r1 = map[id:2]                      <-- map id:2
   6: (85) call __htab_map_lookup_elem#77408   <-+ inlined rewrite
   7: (15) if r0 == 0x0 goto pc+2                |
   8: (07) r0 += 56                              |
   9: (79) r0 = *(u64 *)(r0 +0)                <-+
  10: (15) if r0 == 0x0 goto pc+24
  11: (bf) r2 = r10
  12: (07) r2 += -4
  [...]

Example, same prog, but kallsyms disabled (in that case we are
also not allowed to pass any relative offsets, etc, so prog
becomes pointer sanitized on dump):

  # sysctl kernel.kptr_restrict=2
  kernel.kptr_restrict = 2

  # bpftool prog dump xlated id 3
   0: (b7) r1 = 2
   1: (63) *(u32 *)(r10 -4) = r1
   2: (bf) r2 = r10
   3: (07) r2 += -4
   4: (18) r1 = map[id:2]
   6: (85) call bpf_unspec#0
   7: (15) if r0 == 0x0 goto pc+2
  [...]

Example, BPF calls via interpreter:

  # bpftool prog dump xlated id 1
   0: (85) call pc+2#__bpf_prog_run_args32
   1: (b7) r0 = 1
   2: (95) exit
   3: (b7) r0 = 2
   4: (95) exit

Example, BPF calls via JIT:

  # sysctl net.core.bpf_jit_enable=1
  net.core.bpf_jit_enable = 1
  # sysctl net.core.bpf_jit_kallsyms=1
  net.core.bpf_jit_kallsyms = 1

  # bpftool prog dump xlated id 1
   0: (85) call pc+2#bpf_prog_3b185187f1855c4c_F
   1: (b7) r0 = 1
   2: (95) exit
   3: (b7) r0 = 2
   4: (95) exit

And finally, an example for tail calls that is now working
as well wrt correlation:

  # bpftool prog dump xlated id 2
  [...]
  10: (b7) r2 = 8
  11: (85) call bpf_trace_printk#-41312
  12: (bf) r1 = r6
  13: (18) r2 = map[id:1]
  15: (b7) r3 = 0
  16: (85) call bpf_tail_call#12
  17: (b7) r1 = 42
  18: (6b) *(u16 *)(r6 +46) = r1
  19: (b7) r0 = 0
  20: (95) exit

  # bpftool map show id 1
  1: prog_array  flags 0x0
        key 4B  value 4B  max_entries 1  memlock 4096B

Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
Acked-by: Alexei Starovoitov <ast@kernel.org>
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
2017-12-20 18:09:40 -08:00
Daniel Borkmann
4f74d80971 bpf: fix kallsyms handling for subprogs
Right now kallsyms handling is not working with JITed subprogs.
The reason is that when in 1c2a088a66 ("bpf: x64: add JIT support
for multi-function programs") in jit_subprogs() they are passed
to bpf_prog_kallsyms_add(), then their prog type is 0, which BPF
core will think it's a cBPF program as only cBPF programs have a
0 type. Thus, they need to inherit the type from the main prog.

Once that is fixed, they are indeed added to the BPF kallsyms
infra, but their tag is 0. Therefore, since intention is to add
them as bpf_prog_F_<tag>, we need to pass them to bpf_prog_calc_tag()
first. And once this is resolved, there is a use-after-free on
prog cleanup: we remove the kallsyms entry from the main prog,
later walk all subprogs and call bpf_jit_free() on them. However,
the kallsyms linkage was never released on them. Thus, do that
for all subprogs right in __bpf_prog_put() when refcount hits 0.

Fixes: 1c2a088a66 ("bpf: x64: add JIT support for multi-function programs")
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
Acked-by: Alexei Starovoitov <ast@kernel.org>
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
2017-12-20 18:09:40 -08:00
Alexei Starovoitov
82abbf8d2f bpf: do not allow root to mangle valid pointers
Do not allow root to convert valid pointers into unknown scalars.
In particular disallow:
 ptr &= reg
 ptr <<= reg
 ptr += ptr
and explicitly allow:
 ptr -= ptr
since pkt_end - pkt == length

1.
This minimizes amount of address leaks root can do.
In the future may need to further tighten the leaks with kptr_restrict.

2.
If program has such pointer math it's likely a user mistake and
when verifier complains about it right away instead of many instructions
later on invalid memory access it's easier for users to fix their progs.

3.
when register holding a pointer cannot change to scalar it allows JITs to
optimize better. Like 32-bit archs could use single register for pointers
instead of a pair required to hold 64-bit scalars.

4.
reduces architecture dependent behavior. Since code:
r1 = r10;
r1 &= 0xff;
if (r1 ...)
will behave differently arm64 vs x64 and offloaded vs native.

A significant chunk of ptr mangling was allowed by
commit f1174f77b5 ("bpf/verifier: rework value tracking")
yet some of it was allowed even earlier.

Signed-off-by: Alexei Starovoitov <ast@kernel.org>
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
2017-12-21 02:26:29 +01:00
Alexei Starovoitov
bb7f0f989c bpf: fix integer overflows
There were various issues related to the limited size of integers used in
the verifier:
 - `off + size` overflow in __check_map_access()
 - `off + reg->off` overflow in check_mem_access()
 - `off + reg->var_off.value` overflow or 32-bit truncation of
   `reg->var_off.value` in check_mem_access()
 - 32-bit truncation in check_stack_boundary()

Make sure that any integer math cannot overflow by not allowing
pointer math with large values.

Also reduce the scope of "scalar op scalar" tracking.

Fixes: f1174f77b5 ("bpf/verifier: rework value tracking")
Reported-by: Jann Horn <jannh@google.com>
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
2017-12-21 02:15:41 +01:00
Jann Horn
179d1c5602 bpf: don't prune branches when a scalar is replaced with a pointer
This could be made safe by passing through a reference to env and checking
for env->allow_ptr_leaks, but it would only work one way and is probably
not worth the hassle - not doing it will not directly lead to program
rejection.

Fixes: f1174f77b5 ("bpf/verifier: rework value tracking")
Signed-off-by: Jann Horn <jannh@google.com>
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
2017-12-21 02:15:41 +01:00
Jann Horn
a5ec6ae161 bpf: force strict alignment checks for stack pointers
Force strict alignment checks for stack pointers because the tracking of
stack spills relies on it; unaligned stack accesses can lead to corruption
of spilled registers, which is exploitable.

Fixes: f1174f77b5 ("bpf/verifier: rework value tracking")
Signed-off-by: Jann Horn <jannh@google.com>
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
2017-12-21 02:15:41 +01:00
Jann Horn
ea25f914dc bpf: fix missing error return in check_stack_boundary()
Prevent indirect stack accesses at non-constant addresses, which would
permit reading and corrupting spilled pointers.

Fixes: f1174f77b5 ("bpf/verifier: rework value tracking")
Signed-off-by: Jann Horn <jannh@google.com>
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
2017-12-21 02:15:41 +01:00
Jann Horn
468f6eafa6 bpf: fix 32-bit ALU op verification
32-bit ALU ops operate on 32-bit values and have 32-bit outputs.
Adjust the verifier accordingly.

Fixes: f1174f77b5 ("bpf/verifier: rework value tracking")
Signed-off-by: Jann Horn <jannh@google.com>
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
2017-12-21 02:15:41 +01:00
Jann Horn
0c17d1d2c6 bpf: fix incorrect tracking of register size truncation
Properly handle register truncation to a smaller size.

The old code first mirrors the clearing of the high 32 bits in the bitwise
tristate representation, which is correct. But then, it computes the new
arithmetic bounds as the intersection between the old arithmetic bounds and
the bounds resulting from the bitwise tristate representation. Therefore,
when coerce_reg_to_32() is called on a number with bounds
[0xffff'fff8, 0x1'0000'0007], the verifier computes
[0xffff'fff8, 0xffff'ffff] as bounds of the truncated number.
This is incorrect: The truncated number could also be in the range [0, 7],
and no meaningful arithmetic bounds can be computed in that case apart from
the obvious [0, 0xffff'ffff].

Starting with v4.14, this is exploitable by unprivileged users as long as
the unprivileged_bpf_disabled sysctl isn't set.

Debian assigned CVE-2017-16996 for this issue.

v2:
 - flip the mask during arithmetic bounds calculation (Ben Hutchings)
v3:
 - add CVE number (Ben Hutchings)

Fixes: b03c9f9fdc ("bpf/verifier: track signed and unsigned min/max values")
Signed-off-by: Jann Horn <jannh@google.com>
Acked-by: Edward Cree <ecree@solarflare.com>
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
2017-12-21 02:15:41 +01:00
Jann Horn
95a762e2c8 bpf: fix incorrect sign extension in check_alu_op()
Distinguish between
BPF_ALU64|BPF_MOV|BPF_K (load 32-bit immediate, sign-extended to 64-bit)
and BPF_ALU|BPF_MOV|BPF_K (load 32-bit immediate, zero-padded to 64-bit);
only perform sign extension in the first case.

Starting with v4.14, this is exploitable by unprivileged users as long as
the unprivileged_bpf_disabled sysctl isn't set.

Debian assigned CVE-2017-16995 for this issue.

v3:
 - add CVE number (Ben Hutchings)

Fixes: 484611357c ("bpf: allow access into map value arrays")
Signed-off-by: Jann Horn <jannh@google.com>
Acked-by: Edward Cree <ecree@solarflare.com>
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
2017-12-21 02:15:41 +01:00
Edward Cree
4374f256ce bpf/verifier: fix bounds calculation on BPF_RSH
Incorrect signed bounds were being computed.
If the old upper signed bound was positive and the old lower signed bound was
negative, this could cause the new upper signed bound to be too low,
leading to security issues.

Fixes: b03c9f9fdc ("bpf/verifier: track signed and unsigned min/max values")
Reported-by: Jann Horn <jannh@google.com>
Signed-off-by: Edward Cree <ecree@solarflare.com>
Acked-by: Alexei Starovoitov <ast@kernel.org>
[jannh@google.com: changed description to reflect bug impact]
Signed-off-by: Jann Horn <jannh@google.com>
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
2017-12-21 02:15:41 +01:00
Yonghong Song
06ef0ccb5a bpf/cgroup: fix a verification error for a CGROUP_DEVICE type prog
The tools/testing/selftests/bpf test program
test_dev_cgroup fails with the following error
when compiled with llvm 6.0. (I did not try
with earlier versions.)

  libbpf: load bpf program failed: Permission denied
  libbpf: -- BEGIN DUMP LOG ---
  libbpf:
  0: (61) r2 = *(u32 *)(r1 +4)
  1: (b7) r0 = 0
  2: (55) if r2 != 0x1 goto pc+8
   R0=inv0 R1=ctx(id=0,off=0,imm=0) R2=inv1 R10=fp0
  3: (69) r2 = *(u16 *)(r1 +0)
  invalid bpf_context access off=0 size=2
  ...

The culprit is the following statement in dev_cgroup.c:
  short type = ctx->access_type & 0xFFFF;
This code is typical as the ctx->access_type is assigned
as below in kernel/bpf/cgroup.c:
  struct bpf_cgroup_dev_ctx ctx = {
        .access_type = (access << 16) | dev_type,
        .major = major,
        .minor = minor,
  };

The compiler converts it to u16 access while
the verifier cgroup_dev_is_valid_access rejects
any non u32 access.

This patch permits the field access_type to be accessible
with type u16 and u8 as well.

Signed-off-by: Yonghong Song <yhs@fb.com>
Tested-by: Roman Gushchin <guro@fb.com>
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
2017-12-19 01:43:29 +01:00
Colin Ian King
fa2d41adb9 bpf: make function skip_callee static and return NULL rather than 0
Function skip_callee is local to the source and does not need to
be in global scope, so make it static. Also return NULL rather than 0.
Cleans up two sparse warnings:

symbol 'skip_callee' was not declared. Should it be static?
Using plain integer as NULL pointer

Signed-off-by: Colin Ian King <colin.king@canonical.com>
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
2017-12-19 01:26:04 +01:00
Colin Ian King
e90004d56b bpf: fix spelling mistake: "funcation"-> "function"
Trivial fix to spelling mistake in error message text.

Signed-off-by: Colin Ian King <colin.king@canonical.com>
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
2017-12-19 01:22:59 +01:00
David S. Miller
59436c9ee1 Merge git://git.kernel.org/pub/scm/linux/kernel/git/bpf/bpf-next
Daniel Borkmann says:

====================
pull-request: bpf-next 2017-12-18

The following pull-request contains BPF updates for your *net-next* tree.

The main changes are:

1) Allow arbitrary function calls from one BPF function to another BPF function.
   As of today when writing BPF programs, __always_inline had to be used in
   the BPF C programs for all functions, unnecessarily causing LLVM to inflate
   code size. Handle this more naturally with support for BPF to BPF calls
   such that this __always_inline restriction can be overcome. As a result,
   it allows for better optimized code and finally enables to introduce core
   BPF libraries in the future that can be reused out of different projects.
   x86 and arm64 JIT support was added as well, from Alexei.

2) Add infrastructure for tagging functions as error injectable and allow for
   BPF to return arbitrary error values when BPF is attached via kprobes on
   those. This way of injecting errors generically eases testing and debugging
   without having to recompile or restart the kernel. Tags for opting-in for
   this facility are added with BPF_ALLOW_ERROR_INJECTION(), from Josef.

3) For BPF offload via nfp JIT, add support for bpf_xdp_adjust_head() helper
   call for XDP programs. First part of this work adds handling of BPF
   capabilities included in the firmware, and the later patches add support
   to the nfp verifier part and JIT as well as some small optimizations,
   from Jakub.

4) The bpftool now also gets support for basic cgroup BPF operations such
   as attaching, detaching and listing current BPF programs. As a requirement
   for the attach part, bpftool can now also load object files through
   'bpftool prog load'. This reuses libbpf which we have in the kernel tree
   as well. bpftool-cgroup man page is added along with it, from Roman.

5) Back then commit e87c6bc385 ("bpf: permit multiple bpf attachments for
   a single perf event") added support for attaching multiple BPF programs
   to a single perf event. Given they are configured through perf's ioctl()
   interface, the interface has been extended with a PERF_EVENT_IOC_QUERY_BPF
   command in this work in order to return an array of one or multiple BPF
   prog ids that are currently attached, from Yonghong.

6) Various minor fixes and cleanups to the bpftool's Makefile as well
   as a new 'uninstall' and 'doc-uninstall' target for removing bpftool
   itself or prior installed documentation related to it, from Quentin.

7) Add CONFIG_CGROUP_BPF=y to the BPF kernel selftest config file which is
   required for the test_dev_cgroup test case to run, from Naresh.

8) Fix reporting of XDP prog_flags for nfp driver, from Jakub.

9) Fix libbpf's exit code from the Makefile when libelf was not found in
   the system, also from Jakub.
====================

Signed-off-by: David S. Miller <davem@davemloft.net>
2017-12-18 10:51:06 -05:00
Alexei Starovoitov
1c2a088a66 bpf: x64: add JIT support for multi-function programs
Typical JIT does several passes over bpf instructions to
compute total size and relative offsets of jumps and calls.
With multitple bpf functions calling each other all relative calls
will have invalid offsets intially therefore we need to additional
last pass over the program to emit calls with correct offsets.
For example in case of three bpf functions:
main:
  call foo
  call bpf_map_lookup
  exit
foo:
  call bar
  exit
bar:
  exit

We will call bpf_int_jit_compile() indepedently for main(), foo() and bar()
x64 JIT typically does 4-5 passes to converge.
After these initial passes the image for these 3 functions
will be good except call targets, since start addresses of
foo() and bar() are unknown when we were JITing main()
(note that call bpf_map_lookup will be resolved properly
during initial passes).
Once start addresses of 3 functions are known we patch
call_insn->imm to point to right functions and call
bpf_int_jit_compile() again which needs only one pass.
Additional safety checks are done to make sure this
last pass doesn't produce image that is larger or smaller
than previous pass.

When constant blinding is on it's applied to all functions
at the first pass, since doing it once again at the last
pass can change size of the JITed code.

Tested on x64 and arm64 hw with JIT on/off, blinding on/off.
x64 jits bpf-to-bpf calls correctly while arm64 falls back to interpreter.
All other JITs that support normal BPF_CALL will behave the same way
since bpf-to-bpf call is equivalent to bpf-to-kernel call from
JITs point of view.

Signed-off-by: Alexei Starovoitov <ast@kernel.org>
Acked-by: Daniel Borkmann <daniel@iogearbox.net>
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
2017-12-17 20:34:36 +01:00
Alexei Starovoitov
60b58afc96 bpf: fix net.core.bpf_jit_enable race
global bpf_jit_enable variable is tested multiple times in JITs,
blinding and verifier core. The malicious root can try to toggle
it while loading the programs. This race condition was accounted
for and there should be no issues, but it's safer to avoid
this race condition.

Signed-off-by: Alexei Starovoitov <ast@kernel.org>
Acked-by: Daniel Borkmann <daniel@iogearbox.net>
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
2017-12-17 20:34:36 +01:00
Alexei Starovoitov
1ea47e01ad bpf: add support for bpf_call to interpreter
though bpf_call is still the same call instruction and
calling convention 'bpf to bpf' and 'bpf to helper' is the same
the interpreter has to oparate on 'struct bpf_insn *'.
To distinguish these two cases add a kernel internal opcode and
mark call insns with it.
This opcode is seen by interpreter only. JITs will never see it.
Also add tiny bit of debug code to aid interpreter debugging.

Signed-off-by: Alexei Starovoitov <ast@kernel.org>
Acked-by: Daniel Borkmann <daniel@iogearbox.net>
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
2017-12-17 20:34:36 +01:00
Alexei Starovoitov
cc2b14d510 bpf: teach verifier to recognize zero initialized stack
programs with function calls are often passing various
pointers via stack. When all calls are inlined llvm
flattens stack accesses and optimizes away extra branches.
When functions are not inlined it becomes the job of
the verifier to recognize zero initialized stack to avoid
exploring paths that program will not take.
The following program would fail otherwise:

ptr = &buffer_on_stack;
*ptr = 0;
...
func_call(.., ptr, ...) {
  if (..)
    *ptr = bpf_map_lookup();
}
...
if (*ptr != 0) {
  // Access (*ptr)->field is valid.
  // Without stack_zero tracking such (*ptr)->field access
  // will be rejected
}

since stack slots are no longer uniform invalid | spill | misc
add liveness marking to all slots, but do it in 8 byte chunks.
So if nothing was read or written in [fp-16, fp-9] range
it will be marked as LIVE_NONE.
If any byte in that range was read, it will be marked LIVE_READ
and stacksafe() check will perform byte-by-byte verification.
If all bytes in the range were written the slot will be
marked as LIVE_WRITTEN.
This significantly speeds up state equality comparison
and reduces total number of states processed.

                    before   after
bpf_lb-DLB_L3.o       2051    2003
bpf_lb-DLB_L4.o       3287    3164
bpf_lb-DUNKNOWN.o     1080    1080
bpf_lxc-DDROP_ALL.o   24980   12361
bpf_lxc-DUNKNOWN.o    34308   16605
bpf_netdev.o          15404   10962
bpf_overlay.o         7191    6679

Signed-off-by: Alexei Starovoitov <ast@kernel.org>
Acked-by: Daniel Borkmann <daniel@iogearbox.net>
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
2017-12-17 20:34:35 +01:00
Alexei Starovoitov
f4d7e40a5b bpf: introduce function calls (verification)
Allow arbitrary function calls from bpf function to another bpf function.

To recognize such set of bpf functions the verifier does:
1. runs control flow analysis to detect function boundaries
2. proceeds with verification of all functions starting from main(root) function
It recognizes that the stack of the caller can be accessed by the callee
(if the caller passed a pointer to its stack to the callee) and the callee
can store map_value and other pointers into the stack of the caller.
3. keeps track of the stack_depth of each function to make sure that total
stack depth is still less than 512 bytes
4. disallows pointers to the callee stack to be stored into the caller stack,
since they will be invalid as soon as the callee returns
5. to reuse all of the existing state_pruning logic each function call
is considered to be independent call from the verifier point of view.
The verifier pretends to inline all function calls it sees are being called.
It stores the callsite instruction index as part of the state to make sure
that two calls to the same callee from two different places in the caller
will be different from state pruning point of view
6. more safety checks are added to liveness analysis

Implementation details:
. struct bpf_verifier_state is now consists of all stack frames that
  led to this function
. struct bpf_func_state represent one stack frame. It consists of
  registers in the given frame and its stack
. propagate_liveness() logic had a premature optimization where
  mark_reg_read() and mark_stack_slot_read() were manually inlined
  with loop iterating over parents for each register or stack slot.
  Undo this optimization to reuse more complex mark_*_read() logic
. skip_callee() logic is not necessary from safety point of view,
  but without it mark_*_read() markings become too conservative,
  since after returning from the funciton call a read of r6-r9
  will incorrectly propagate the read marks into callee causing
  inefficient pruning later
. mark_*_read() logic is now aware of control flow which makes it
  more complex. In the future the plan is to rewrite liveness
  to be hierarchical. So that liveness can be done within
  basic block only and control flow will be responsible for
  propagation of liveness information along cfg and between calls.
. tail_calls and ld_abs insns are not allowed in the programs with
  bpf-to-bpf calls
. returning stack pointers to the caller or storing them into stack
  frame of the caller is not allowed

Testing:
. no difference in cilium processed_insn numbers
. large number of tests follows in next patches

Signed-off-by: Alexei Starovoitov <ast@kernel.org>
Acked-by: John Fastabend <john.fastabend@gmail.com>
Acked-by: Daniel Borkmann <daniel@iogearbox.net>
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
2017-12-17 20:34:35 +01:00
Alexei Starovoitov
cc8b0b92a1 bpf: introduce function calls (function boundaries)
Allow arbitrary function calls from bpf function to another bpf function.

Since the beginning of bpf all bpf programs were represented as a single function
and program authors were forced to use always_inline for all functions
in their C code. That was causing llvm to unnecessary inflate the code size
and forcing developers to move code to header files with little code reuse.

With a bit of additional complexity teach verifier to recognize
arbitrary function calls from one bpf function to another as long as
all of functions are presented to the verifier as a single bpf program.
New program layout:
r6 = r1    // some code
..
r1 = ..    // arg1
r2 = ..    // arg2
call pc+1  // function call pc-relative
exit
.. = r1    // access arg1
.. = r2    // access arg2
..
call pc+20 // second level of function call
...

It allows for better optimized code and finally allows to introduce
the core bpf libraries that can be reused in different projects,
since programs are no longer limited by single elf file.
With function calls bpf can be compiled into multiple .o files.

This patch is the first step. It detects programs that contain
multiple functions and checks that calls between them are valid.
It splits the sequence of bpf instructions (one program) into a set
of bpf functions that call each other. Calls to only known
functions are allowed. In the future the verifier may allow
calls to unresolved functions and will do dynamic linking.
This logic supports statically linked bpf functions only.

Such function boundary detection could have been done as part of
control flow graph building in check_cfg(), but it's cleaner to
separate function boundary detection vs control flow checks within
a subprogram (function) into logically indepedent steps.
Follow up patches may split check_cfg() further, but not check_subprogs().

Only allow bpf-to-bpf calls for root only and for non-hw-offloaded programs.
These restrictions can be relaxed in the future.

Signed-off-by: Alexei Starovoitov <ast@kernel.org>
Acked-by: Daniel Borkmann <daniel@iogearbox.net>
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
2017-12-17 20:34:35 +01:00
David S. Miller
c30abd5e40 Merge git://git.kernel.org/pub/scm/linux/kernel/git/davem/net
Three sets of overlapping changes, two in the packet scheduler
and one in the meson-gxl PHY driver.

Signed-off-by: David S. Miller <davem@davemloft.net>
2017-12-16 22:11:55 -05:00
Daniel Borkmann
04514d1322 bpf: guarantee r1 to be ctx in case of bpf_helper_changes_pkt_data
Some JITs don't cache skb context on stack in prologue, so when
LD_ABS/IND is used and helper calls yield bpf_helper_changes_pkt_data()
as true, then they temporarily save/restore skb pointer. However,
the assumption that skb always has to be in r1 is a bit of a
gamble. Right now it turned out to be true for all helpers listed
in bpf_helper_changes_pkt_data(), but lets enforce that from verifier
side, so that we make this a guarantee and bail out if the func
proto is misconfigured in future helpers.

In case of BPF helper calls from cBPF, bpf_helper_changes_pkt_data()
is completely unrelevant here (since cBPF is context read-only) and
therefore always false.

Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
Acked-by: Alexei Starovoitov <ast@kernel.org>
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
2017-12-15 09:19:35 -08:00
Eric Dumazet
9147efcbe0 bpf: add schedule points to map alloc/free
While using large percpu maps, htab_map_alloc() can hold
cpu for hundreds of ms.

This patch adds cond_resched() calls to percpu alloc/free
call sites, all running in process context.

Signed-off-by: Eric Dumazet <edumazet@google.com>
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
2017-12-12 15:27:22 -08:00
Josef Bacik
9802d86585 bpf: add a bpf_override_function helper
Error injection is sloppy and very ad-hoc.  BPF could fill this niche
perfectly with it's kprobe functionality.  We could make sure errors are
only triggered in specific call chains that we care about with very
specific situations.  Accomplish this with the bpf_override_funciton
helper.  This will modify the probe'd callers return value to the
specified value and set the PC to an override function that simply
returns, bypassing the originally probed function.  This gives us a nice
clean way to implement systematic error injection for all of our code
paths.

Acked-by: Alexei Starovoitov <ast@kernel.org>
Acked-by: Ingo Molnar <mingo@kernel.org>
Signed-off-by: Josef Bacik <jbacik@fb.com>
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
2017-12-12 09:02:34 -08:00
Yonghong Song
f371b304f1 bpf/tracing: allow user space to query prog array on the same tp
Commit e87c6bc385 ("bpf: permit multiple bpf attachments
for a single perf event") added support to attach multiple
bpf programs to a single perf event.
Although this provides flexibility, users may want to know
what other bpf programs attached to the same tp interface.
Besides getting visibility for the underlying bpf system,
such information may also help consolidate multiple bpf programs,
understand potential performance issues due to a large array,
and debug (e.g., one bpf program which overwrites return code
may impact subsequent program results).

Commit 2541517c32 ("tracing, perf: Implement BPF programs
attached to kprobes") utilized the existing perf ioctl
interface and added the command PERF_EVENT_IOC_SET_BPF
to attach a bpf program to a tracepoint. This patch adds a new
ioctl command, given a perf event fd, to query the bpf program
array attached to the same perf tracepoint event.

The new uapi ioctl command:
  PERF_EVENT_IOC_QUERY_BPF

The new uapi/linux/perf_event.h structure:
  struct perf_event_query_bpf {
       __u32	ids_len;
       __u32	prog_cnt;
       __u32	ids[0];
  };

User space provides buffer "ids" for kernel to copy to.
When returning from the kernel, the number of available
programs in the array is set in "prog_cnt".

The usage:
  struct perf_event_query_bpf *query =
    malloc(sizeof(*query) + sizeof(u32) * ids_len);
  query.ids_len = ids_len;
  err = ioctl(pmu_efd, PERF_EVENT_IOC_QUERY_BPF, query);
  if (err == 0) {
    /* query.prog_cnt is the number of available progs,
     * number of progs in ids: (ids_len == 0) ? 0 : query.prog_cnt
     */
  } else if (errno == ENOSPC) {
    /* query.ids_len number of progs copied,
     * query.prog_cnt is the number of available progs
     */
  } else {
      /* other errors */
  }

Signed-off-by: Yonghong Song <yhs@fb.com>
Acked-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Acked-by: Alexei Starovoitov <ast@kernel.org>
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
2017-12-12 08:46:40 -08:00
David S. Miller
7cda4cee13 Merge git://git.kernel.org/pub/scm/linux/kernel/git/davem/net
Small overlapping change conflict ('net' changed a line,
'net-next' added a line right afterwards) in flexcan.c

Signed-off-by: David S. Miller <davem@davemloft.net>
2017-12-05 10:44:19 -05:00
Alexei Starovoitov
914cb781ee bpf: cleanup register_is_null()
don't pass large struct bpf_reg_state by value.
Instead pass it by pointer.

Signed-off-by: Alexei Starovoitov <ast@kernel.org>
Acked-by: John Fastabend <john.fastabend@gmail.com>
Acked-by: Daniel Borkmann <daniel@iogearbox.net>
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
2017-12-01 11:25:10 +01:00
Alexei Starovoitov
3bf15921c5 bpf: improve JEQ/JNE path walking
verifier knows how to trim paths that are known not to be
taken at run-time when register containing run-time constant
is compared with another constant.
It was done only for JEQ comparison.
Extend it to include JNE as well.
More cases can be added in the future.

                     before  after
bpf_lb-DLB_L3.o       2270    2051
bpf_lb-DLB_L4.o       3682    3287
bpf_lb-DUNKNOWN.o     1110    1080
bpf_lxc-DDROP_ALL.o   27876   24980
bpf_lxc-DUNKNOWN.o    38780   34308
bpf_netdev.o          16937   15404
bpf_overlay.o         7929    7191

Signed-off-by: Alexei Starovoitov <ast@kernel.org>
Acked-by: John Fastabend <john.fastabend@gmail.com>
Acked-by: Daniel Borkmann <daniel@iogearbox.net>
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
2017-12-01 11:25:10 +01:00
Alexei Starovoitov
2f18f62ee1 bpf: improve verifier liveness marks
registers with pointers filled from stack were missing live_written marks
which caused liveness propagation to unnecessary mark more registers as
live_read and miss state pruning opportunities later on.

                     before  after
bpf_lb-DLB_L3.o       2285   2270
bpf_lb-DLB_L4.o       3723   3682
bpf_lb-DUNKNOWN.o     1110   1110
bpf_lxc-DDROP_ALL.o   27954  27876
bpf_lxc-DUNKNOWN.o    38954  38780
bpf_netdev.o          16943  16937
bpf_overlay.o         7929   7929

Signed-off-by: Alexei Starovoitov <ast@kernel.org>
Acked-by: John Fastabend <john.fastabend@gmail.com>
Acked-by: Daniel Borkmann <daniel@iogearbox.net>
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
2017-12-01 11:25:10 +01:00
Alexei Starovoitov
19ceb4178d bpf: don't mark FP reg as uninit
when verifier hits an internal bug don't mark register R10==FP as uninit,
since it's read only register and it's not technically correct to let
verifier run further, since it may assume that R10 has valid auxiliary state.

While developing subsequent patches this issue was discovered,
though the code eventually changed that aux reg state doesn't have
pointers any more it is still safer to avoid clearing readonly register.

Signed-off-by: Alexei Starovoitov <ast@kernel.org>
Acked-by: John Fastabend <john.fastabend@gmail.com>
Acked-by: Daniel Borkmann <daniel@iogearbox.net>
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
2017-12-01 11:25:10 +01:00
Alexei Starovoitov
4e92024a48 bpf: print liveness info to verifier log
let verifier print register and stack liveness information
into verifier log

Signed-off-by: Alexei Starovoitov <ast@kernel.org>
Acked-by: John Fastabend <john.fastabend@gmail.com>
Acked-by: Daniel Borkmann <daniel@iogearbox.net>
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
2017-12-01 11:25:10 +01:00
Alexei Starovoitov
12a3cc8424 bpf: fix stack state printing in verifier log
fix incorrect stack state prints in print_verifier_state()

Fixes: 638f5b90d4 ("bpf: reduce verifier memory consumption")
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
Acked-by: John Fastabend <john.fastabend@gmail.com>
Acked-by: Daniel Borkmann <daniel@iogearbox.net>
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
2017-12-01 11:25:10 +01:00
Yonghong Song
c8c088ba0e bpf: set maximum number of attached progs to 64 for a single perf tp
cgropu+bpf prog array has a maximum number of 64 programs.
Let us apply the same limit here.

Fixes: e87c6bc385 ("bpf: permit multiple bpf attachments for a single perf event")
Signed-off-by: Yonghong Song <yhs@fb.com>
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
2017-12-01 02:56:10 +01:00
Jakub Kicinski
a39e17b2d8 bpf: offload: add a license header
I forgot to add a license on kernel/bpf/offload.c.  Luckily I'm
still the only author so make it explicitly GPLv2.

Signed-off-by: Jakub Kicinski <jakub.kicinski@netronome.com>
Reviewed-by: Simon Horman <simon.horman@netronome.com>
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
2017-11-27 22:24:51 +01:00
Alexei Starovoitov
c131187db2 bpf: fix branch pruning logic
when the verifier detects that register contains a runtime constant
and it's compared with another constant it will prune exploration
of the branch that is guaranteed not to be taken at runtime.
This is all correct, but malicious program may be constructed
in such a way that it always has a constant comparison and
the other branch is never taken under any conditions.
In this case such path through the program will not be explored
by the verifier. It won't be taken at run-time either, but since
all instructions are JITed the malicious program may cause JITs
to complain about using reserved fields, etc.
To fix the issue we have to track the instructions explored by
the verifier and sanitize instructions that are dead at run time
with NOPs. We cannot reject such dead code, since llvm generates
it for valid C code, since it doesn't do as much data flow
analysis as the verifier does.

Fixes: 17a5267067 ("bpf: verifier (add verifier core)")
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
Acked-by: Daniel Borkmann <daniel@iogearbox.net>
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
2017-11-23 10:56:35 +01:00
Gianluca Borello
db1ac4964f bpf: introduce ARG_PTR_TO_MEM_OR_NULL
With the current ARG_PTR_TO_MEM/ARG_PTR_TO_UNINIT_MEM semantics, an helper
argument can be NULL when the next argument type is ARG_CONST_SIZE_OR_ZERO
and the verifier can prove the value of this next argument is 0. However,
most helpers are just interested in handling <!NULL, 0>, so forcing them to
deal with <NULL, 0> makes the implementation of those helpers more
complicated for no apparent benefits, requiring them to explicitly handle
those corner cases with checks that bpf programs could start relying upon,
preventing the possibility of removing them later.

Solve this by making ARG_PTR_TO_MEM/ARG_PTR_TO_UNINIT_MEM never accept NULL
even when ARG_CONST_SIZE_OR_ZERO is set, and introduce a new argument type
ARG_PTR_TO_MEM_OR_NULL to explicitly deal with the NULL case.

Currently, the only helper that needs this is bpf_csum_diff_proto(), so
change arg1 and arg3 to this new type as well.

Also add a new battery of tests that explicitly test the
!ARG_PTR_TO_MEM_OR_NULL combination: all the current ones testing the
various <NULL, 0> variations are focused on bpf_csum_diff, so cover also
other helpers.

Signed-off-by: Gianluca Borello <g.borello@gmail.com>
Acked-by: Alexei Starovoitov <ast@kernel.org>
Acked-by: Daniel Borkmann <daniel@iogearbox.net>
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
2017-11-22 21:40:54 +01:00
Jakub Kicinski
1ee640095f bpf: revert report offload info to user space
This reverts commit bd601b6ada ("bpf: report offload info to user
space").  The ifindex by itself is not sufficient, we should provide
information on which network namespace this ifindex belongs to.
After considering some options we concluded that it's best to just
remove this API for now, and rework it in -next.

Signed-off-by: Jakub Kicinski <jakub.kicinski@netronome.com>
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
2017-11-21 00:37:35 +01:00