kernel_optimize_test

Author	SHA1	Message	Date
Alexei Starovoitov	7ae457c1e5	net: filter: split 'struct sk_filter' into socket and bpf parts clean up names related to socket filtering and bpf in the following way: - everything that deals with sockets keeps 'sk_' prefix - everything that is pure BPF is changed to 'bpf_' prefix split 'struct sk_filter' into struct sk_filter { atomic_t refcnt; struct rcu_head rcu; struct bpf_prog prog; }; and struct bpf_prog { u32 jited:1, len:31; struct sock_fprog_kern orig_prog; unsigned int (bpf_func)(const struct sk_buff skb, const struct bpf_insn filter); union { struct sock_filter insns[0]; struct bpf_insn insnsi[0]; struct work_struct work; }; }; so that 'struct bpf_prog' can be used independent of sockets and cleans up 'unattached' bpf use cases split SK_RUN_FILTER macro into: SK_RUN_FILTER to be used with 'struct sk_filter ' and BPF_PROG_RUN to be used with 'struct bpf_prog ' __sk_filter_release(struct sk_filter ) gains __bpf_prog_release(struct bpf_prog ) helper function also perform related renames for the functions that work with 'struct bpf_prog ', since they're on the same lines: sk_filter_size -> bpf_prog_size sk_filter_select_runtime -> bpf_prog_select_runtime sk_filter_free -> bpf_prog_free sk_unattached_filter_create -> bpf_prog_create sk_unattached_filter_destroy -> bpf_prog_destroy sk_store_orig_filter -> bpf_prog_store_orig_filter sk_release_orig_filter -> bpf_release_orig_filter __sk_migrate_filter -> bpf_migrate_filter __sk_prepare_filter -> bpf_prepare_filter API for attaching classic BPF to a socket stays the same: sk_attach_filter(prog, struct sock )/sk_detach_filter(struct sock ) and SK_RUN_FILTER(struct sk_filter , ctx) to execute a program which is used by sockets, tun, af_packet API for 'unattached' BPF programs becomes: bpf_prog_create(struct bpf_prog )/bpf_prog_destroy(struct bpf_prog ) and BPF_PROG_RUN(struct bpf_prog *, ctx) to execute a program which is used by isdn, ppp, team, seccomp, ptp, xt_bpf, cls_bpf, test_bpf Signed-off-by: Alexei Starovoitov <ast@plumgrid.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2014-08-02 15:03:58 -07:00
Daniel Borkmann	3480593131	net: filter: get rid of BPF_S_* enum This patch finally allows us to get rid of the BPF_S_* enum. Currently, the code performs unnecessary encode and decode workarounds in seccomp and filter migration itself when a filter is being attached in order to overcome BPF_S_* encoding which is not used anymore by the new interpreter resp. JIT compilers. Keeping it around would mean that also in future we would need to extend and maintain this enum and related encoders/decoders. We can get rid of all that and save us these operations during filter attaching. Naturally, also JIT compilers need to be updated by this. Before JIT conversion is being done, each compiler checks if A is being loaded at startup to obtain information if it needs to emit instructions to clear A first. Since BPF extensions are a subset of BPF_LD \| BPF_{W,H,B} \| BPF_ABS variants, case statements for extensions can be removed at that point. To ease and minimalize code changes in the classic JITs, we have introduced bpf_anc_helper(). Tested with test_bpf on x86_64 (JIT, int), s390x (JIT, int), arm (JIT, int), i368 (int), ppc64 (JIT, int); for sparc we unfortunately didn't have access, but changes are analogous to the rest. Joint work with Alexei Starovoitov. Signed-off-by: Daniel Borkmann <dborkman@redhat.com> Signed-off-by: Alexei Starovoitov <ast@plumgrid.com> Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org> Cc: Martin Schwidefsky <schwidefsky@de.ibm.com> Cc: Mircea Gherzan <mgherzan@gmail.com> Cc: Kees Cook <keescook@chromium.org> Acked-by: Chema Gonzalez <chemag@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2014-06-01 22:16:58 -07:00
Heiko Carstens	e84d2f8d2a	net: filter: s390: fix JIT address randomization This is the s390 variant of Alexei's JIT bug fix. (patch description below stolen from Alexei's patch) bpf_alloc_binary() adds 128 bytes of room to JITed program image and rounds it up to the nearest page size. If image size is close to page size (like 4000), it is rounded to two pages: round_up(4000 + 4 + 128) == 8192 then 'hole' is computed as 8192 - (4000 + 4) = 4188 If prandom_u32() % hole selects a number >= PAGE_SIZE - sizeof(header) then kernel will crash during bpf_jit_free(): kernel BUG at arch/x86/mm/pageattr.c:887! Call Trace: [<ffffffff81037285>] change_page_attr_set_clr+0x135/0x460 [<ffffffff81694cc0>] ? _raw_spin_unlock_irq+0x30/0x50 [<ffffffff810378ff>] set_memory_rw+0x2f/0x40 [<ffffffffa01a0d8d>] bpf_jit_free_deferred+0x2d/0x60 [<ffffffff8106bf98>] process_one_work+0x1d8/0x6a0 [<ffffffff8106bf38>] ? process_one_work+0x178/0x6a0 [<ffffffff8106c90c>] worker_thread+0x11c/0x370 since bpf_jit_free() does: unsigned long addr = (unsigned long)fp->bpf_func & PAGE_MASK; struct bpf_binary_header header = (void *)addr; to compute start address of 'bpf_binary_header' and header->pages will pass junk to: set_memory_rw(addr, header->pages); Fix it by making sure that &header->image[prandom_u32() % hole] and &header are in the same page. Fixes: `aa2d2c73c2` ("s390/bpf,jit: address randomize and write protect jit code") Reported-by: Alexei Starovoitov <ast@plumgrid.com> Cc: <stable@vger.kernel.org> # v3.11+ Signed-off-by: Heiko Carstens <heiko.carstens@de.ibm.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2014-05-14 16:10:16 -04:00
Martin Schwidefsky	6e0de81759	s390/bpf,jit: initialize A register if 1st insn is BPF_S_LDX_B_MSH The A register needs to be initialized to zero in the prolog if the first instruction of the BPF program is BPF_S_LDX_B_MSH to prevent leaking the content of %r5 to user space. Cc: <stable@vger.kernel.org> Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>	2014-04-25 14:03:25 +02:00
Daniel Borkmann	f8bbbfc3b9	net: filter: add jited flag to indicate jit compiled filters This patch adds a jited flag into sk_filter struct in order to indicate whether a filter is currently jited or not. The size of sk_filter is not being expanded as the 32 bit 'len' member allows upper bits to be reused since a filter can currently only grow as large as BPF_MAXINSNS. Therefore, there's enough room also for other in future needed flags to reuse 'len' field if necessary. The jited flag also allows for having alternative interpreter functions running as currently, we can only detect jit compiled filters by testing fp->bpf_func to not equal the address of sk_run_filter(). Joint work with Alexei Starovoitov. Signed-off-by: Alexei Starovoitov <ast@plumgrid.com> Signed-off-by: Daniel Borkmann <dborkman@redhat.com> Cc: Pablo Neira Ayuso <pablo@netfilter.org> Signed-off-by: David S. Miller <davem@davemloft.net>	2014-03-31 00:45:08 -04:00
Tom Herbert	61b905da33	net: Rename skb->rxhash to skb->hash The packet hash can be considered a property of the packet, not just on RX path. This patch changes name of rxhash and l4_rxhash skbuff fields to be hash and l4_hash respectively. This includes changing uses of the field in the code which don't call the access functions. Signed-off-by: Tom Herbert <therbert@google.com> Signed-off-by: Eric Dumazet <edumazet@google.com> Cc: Mahesh Bandewar <maheshb@google.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2014-03-26 15:58:20 -04:00
Heiko Carstens	3af57f78c3	s390/bpf,jit: fix 32 bit divisions, use unsigned divide instructions The s390 bpf jit compiler emits the signed divide instructions "dr" and "d" for unsigned divisions. This can cause problems: the dividend will be zero extended to a 64 bit value and the divisor is the 32 bit signed value as specified A or X accumulator, even though A and X are supposed to be treated as unsigned values. The divide instrunctions will generate an exception if the result cannot be expressed with a 32 bit signed value. This is the case if e.g. the dividend is 0xffffffff and the divisor either 1 or also 0xffffffff (signed: -1). To avoid all these issues simply use unsigned divide instructions. Signed-off-by: Heiko Carstens <heiko.carstens@de.ibm.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2014-01-17 18:54:49 -08:00
Eric Dumazet	aee636c480	bpf: do not use reciprocal divide At first Jakub Zawadzki noticed that some divisions by reciprocal_divide were not correct. (off by one in some cases) http://www.wireshark.org/~darkjames/reciprocal-buggy.c He could also show this with BPF: http://www.wireshark.org/~darkjames/set-and-dump-filter-k-bug.c The reciprocal divide in linux kernel is not generic enough, lets remove its use in BPF, as it is not worth the pain with current cpus. Signed-off-by: Eric Dumazet <edumazet@google.com> Reported-by: Jakub Zawadzki <darkjames-ws@darkjames.pl> Cc: Mircea Gherzan <mgherzan@gmail.com> Cc: Daniel Borkmann <dxchgb@gmail.com> Cc: Hannes Frederic Sowa <hannes@stressinduktion.org> Cc: Matt Evans <matt@ozlabs.org> Cc: Martin Schwidefsky <schwidefsky@de.ibm.com> Cc: Heiko Carstens <heiko.carstens@de.ibm.com> Cc: David S. Miller <davem@davemloft.net> Signed-off-by: David S. Miller <davem@davemloft.net>	2014-01-15 17:02:08 -08:00
Martin Schwidefsky	6cef30034c	s390/bpf,jit: fix prolog oddity The prolog of functions generated by the bpf jit compiler uses an instruction sequence with an "ahi" instruction to create stack space instead of using an "aghi" instruction. Using the 32-bit "ahi" is not wrong as the stack we are operating on is an order-4 allocation which is always aligned to 16KB. But it is more consistent to use an "aghi" as the stack pointer is a 64-bit value. Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>	2013-10-24 17:16:59 +02:00
Heiko Carstens	0f20822a69	s390/dis: move disassembler function prototypes to proper header file Now that the in-kernel disassembler has an own header file move the disassembler related function prototypes to that header file. Signed-off-by: Heiko Carstens <heiko.carstens@de.ibm.com> Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>	2013-10-24 17:16:48 +02:00
Alexei Starovoitov	d45ed4a4e3	net: fix unsafe set_memory_rw from softirq on x86 system with net.core.bpf_jit_enable = 1 sudo tcpdump -i eth1 'tcp port 22' causes the warning: [ 56.766097] Possible unsafe locking scenario: [ 56.766097] [ 56.780146] CPU0 [ 56.786807] ---- [ 56.793188] lock(&(&vb->lock)->rlock); [ 56.799593] <Interrupt> [ 56.805889] lock(&(&vb->lock)->rlock); [ 56.812266] [ 56.812266] * DEADLOCK * [ 56.812266] [ 56.830670] 1 lock held by ksoftirqd/1/13: [ 56.836838] #0: (rcu_read_lock){.+.+..}, at: [<ffffffff8118f44c>] vm_unmap_aliases+0x8c/0x380 [ 56.849757] [ 56.849757] stack backtrace: [ 56.862194] CPU: 1 PID: 13 Comm: ksoftirqd/1 Not tainted 3.12.0-rc3+ #45 [ 56.868721] Hardware name: System manufacturer System Product Name/P8Z77 WS, BIOS 3007 07/26/2012 [ 56.882004] ffffffff821944c0 ffff88080bbdb8c8 ffffffff8175a145 0000000000000007 [ 56.895630] ffff88080bbd5f40 ffff88080bbdb928 ffffffff81755b14 0000000000000001 [ 56.909313] ffff880800000001 ffff880800000000 ffffffff8101178f 0000000000000001 [ 56.923006] Call Trace: [ 56.929532] [<ffffffff8175a145>] dump_stack+0x55/0x76 [ 56.936067] [<ffffffff81755b14>] print_usage_bug+0x1f7/0x208 [ 56.942445] [<ffffffff8101178f>] ? save_stack_trace+0x2f/0x50 [ 56.948932] [<ffffffff810cc0a0>] ? check_usage_backwards+0x150/0x150 [ 56.955470] [<ffffffff810ccb52>] mark_lock+0x282/0x2c0 [ 56.961945] [<ffffffff810ccfed>] __lock_acquire+0x45d/0x1d50 [ 56.968474] [<ffffffff810cce6e>] ? __lock_acquire+0x2de/0x1d50 [ 56.975140] [<ffffffff81393bf5>] ? cpumask_next_and+0x55/0x90 [ 56.981942] [<ffffffff810cef72>] lock_acquire+0x92/0x1d0 [ 56.988745] [<ffffffff8118f52a>] ? vm_unmap_aliases+0x16a/0x380 [ 56.995619] [<ffffffff817628f1>] _raw_spin_lock+0x41/0x50 [ 57.002493] [<ffffffff8118f52a>] ? vm_unmap_aliases+0x16a/0x380 [ 57.009447] [<ffffffff8118f52a>] vm_unmap_aliases+0x16a/0x380 [ 57.016477] [<ffffffff8118f44c>] ? vm_unmap_aliases+0x8c/0x380 [ 57.023607] [<ffffffff810436b0>] change_page_attr_set_clr+0xc0/0x460 [ 57.030818] [<ffffffff810cfb8d>] ? trace_hardirqs_on+0xd/0x10 [ 57.037896] [<ffffffff811a8330>] ? kmem_cache_free+0xb0/0x2b0 [ 57.044789] [<ffffffff811b59c3>] ? free_object_rcu+0x93/0xa0 [ 57.051720] [<ffffffff81043d9f>] set_memory_rw+0x2f/0x40 [ 57.058727] [<ffffffff8104e17c>] bpf_jit_free+0x2c/0x40 [ 57.065577] [<ffffffff81642cba>] sk_filter_release_rcu+0x1a/0x30 [ 57.072338] [<ffffffff811108e2>] rcu_process_callbacks+0x202/0x7c0 [ 57.078962] [<ffffffff81057f17>] __do_softirq+0xf7/0x3f0 [ 57.085373] [<ffffffff81058245>] run_ksoftirqd+0x35/0x70 cannot reuse jited filter memory, since it's readonly, so use original bpf insns memory to hold work_struct defer kfree of sk_filter until jit completed freeing tested on x86_64 and i386 Signed-off-by: Alexei Starovoitov <ast@plumgrid.com> Acked-by: Eric Dumazet <edumazet@google.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2013-10-07 15:16:45 -04:00
Heiko Carstens	4784955a52	s390/bpf,jit: fix address randomization Add misssing braces to hole calculation. This resulted in an addition instead of an substraction. Which in turn means that the jit compiler could try to write out of bounds of the allocated piece of memory. This bug was introduced with `aa2d2c73` "s390/bpf,jit: address randomize and write protect jit code". Fixes this one: [ 37.320956] Unable to handle kernel pointer dereference at virtual kernel address 000003ff80231000 [ 37.320984] Oops: 0011 [#1] PREEMPT SMP DEBUG_PAGEALLOC [ 37.320993] Modules linked in: dm_multipath scsi_dh eadm_sch dm_mod ctcm fsm autofs4 [ 37.321007] CPU: 28 PID: 6443 Comm: multipathd Not tainted 3.10.9-61.x.20130829-s390xdefault #1 [ 37.321011] task: 0000004ada778000 ti: 0000004ae3304000 task.ti: 0000004ae3304000 [ 37.321014] Krnl PSW : 0704c00180000000 000000000012d1de (bpf_jit_compile+0x198e/0x23d0) [ 37.321022] R:0 T:1 IO:1 EX:1 Key:0 M:1 W:0 P:0 AS:3 CC:0 PM:0 EA:3 Krnl GPRS: 000000004350207d 0000004a00000001 0000000000000007 000003ff80231002 [ 37.321029] 0000000000000007 000003ff80230ffe 00000000a7740000 000003ff80230f76 [ 37.321032] 000003ffffffffff 000003ff00000000 000003ff0000007d 000000000071e820 [ 37.321035] 0000004adbe99950 000000000071ea18 0000004af3d9e7c0 0000004ae3307b80 [ 37.321046] Krnl Code: 000000000012d1d0: 41305004 la %r3,4(%r5) 000000000012d1d4: e330f0f80021 clg %r3,248(%r15) #000000000012d1da: a7240009 brc 2,12d1ec >000000000012d1de: 50805000 st %r8,0(%r5) 000000000012d1e2: e330f0f00004 lg %r3,240(%r15) 000000000012d1e8: 41303004 la %r3,4(%r3) 000000000012d1ec: e380f0e00004 lg %r8,224(%r15) 000000000012d1f2: e330f0f00024 stg %r3,240(%r15) [ 37.321074] Call Trace: [ 37.321077] ([<000000000012da78>] bpf_jit_compile+0x2228/0x23d0) [ 37.321083] [<00000000006007c2>] sk_attach_filter+0xfe/0x214 [ 37.321090] [<00000000005d2d92>] sock_setsockopt+0x926/0xbdc [ 37.321097] [<00000000005cbfb6>] SyS_setsockopt+0x8a/0xe8 [ 37.321101] [<00000000005ccaa8>] SyS_socketcall+0x264/0x364 [ 37.321106] [<0000000000713f1c>] sysc_nr_ok+0x22/0x28 [ 37.321113] [<000003fffce10ea8>] 0x3fffce10ea8 [ 37.321118] INFO: lockdep is turned off. [ 37.321121] Last Breaking-Event-Address: [ 37.321124] [<000000000012d192>] bpf_jit_compile+0x1942/0x23d0 [ 37.321132] [ 37.321135] Kernel panic - not syncing: Fatal exception: panic_on_oops Cc: stable@vger.kernel.org # v3.11 Signed-off-by: Heiko Carstens <heiko.carstens@de.ibm.com>	2013-09-04 17:18:55 +02:00
Heiko Carstens	c9a7afa380	s390/bpf,jit: add pkt_type support s390 version of `3b58908a` "x86: bpf_jit_comp: add pkt_type support". Signed-off-by: Heiko Carstens <heiko.carstens@de.ibm.com>	2013-07-18 12:44:38 +02:00
Heiko Carstens	aa2d2c73c2	s390/bpf,jit: address randomize and write protect jit code This is the s390 variant of `314beb9b` "x86: bpf_jit_comp: secure bpf jit against spraying attacks". With this change the whole jit code and literal pool will be write protected after creation. In addition the start address of the jit code won't be always on a page boundary anymore. Signed-off-by: Heiko Carstens <heiko.carstens@de.ibm.com> Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>	2013-07-18 12:44:37 +02:00
Heiko Carstens	fee1b5488d	s390/bpf,jit: use generic jit dumper This is the s390 backend of `79617801` "filter: bpf_jit_comp: refactor and unify BPF JIT image dump output". Signed-off-by: Heiko Carstens <heiko.carstens@de.ibm.com> Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>	2013-07-18 12:44:35 +02:00
Heiko Carstens	1eeb74782d	s390/bpf,jit: call module_free() from any context The workqueue workaround is no longer needed. Same as `5199dfe531` "sparc: bpf_jit_comp: can call module_free() from any context". Signed-off-by: Heiko Carstens <heiko.carstens@de.ibm.com> Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>	2013-07-18 12:44:34 +02:00
Stelian Nirlu	3d04fea5e7	s390/bpf,jit: use kcalloc instead of kmalloc and memset Signed-off-by: Stelian Nirlu <steliannirlu@gmail.com> Signed-off-by: Heiko Carstens <heiko.carstens@de.ibm.com> Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>	2013-04-17 14:07:27 +02:00
Heiko Carstens	5303a0fe8c	s390/bpf,jit: add vlan tag support s390 version of `855ddb56` "x86: bpf_jit_comp: add vlan tag support". Signed-off-by: Heiko Carstens <heiko.carstens@de.ibm.com> Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>	2013-02-14 15:55:20 +01:00
Heiko Carstens	916908df24	s390/bpf,jit: add support for XOR instruction Add support for XOR instruction for use with X/K. s390 JIT support for the new BPF_S_ALU_XOR_* instructions introduced with `9e49e889` "filter: add XOR instruction for use with X/K". Signed-off-by: Heiko Carstens <heiko.carstens@de.ibm.com> Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>	2012-12-03 10:44:05 -05:00
Heiko Carstens	3247274536	s390/bpf,jit: add support MOD instruction Add support for MOD operation for s390's JIT. Same as `280050cc` "x86 bpf_jit: support MOD operation" for x86 which adds JIT support for the generic new MOD operation introduced with `b6069a9570` "filter: add MOD operation". Signed-off-by: Heiko Carstens <heiko.carstens@de.ibm.com> Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>	2012-12-03 10:44:02 -05:00
Heiko Carstens	c59eed111b	s390/bpf,jit: add support for BPF_S_ANC_ALU_XOR_X instruction Add support for new BPF_S_ANC_ALU_XOR_X instruction which got added with `ffe06c17` "filter: add XOR operation". s390 version of `4bfaddf1` "x86 bpf_jit: support BPF_S_ANC_ALU_XOR_X instruction". Signed-off-by: Heiko Carstens <heiko.carstens@de.ibm.com> Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>	2012-09-26 15:45:28 +02:00
Heiko Carstens	68d9884dbc	s390/bpf,jit: improve code generation Make use of new immediate instructions that came with the extended immediate and general instruction extension facilities. Reviewed-by: Martin Schwidefsky <schwidefsky@de.ibm.com> Signed-off-by: Heiko Carstens <heiko.carstens@de.ibm.com> Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>	2012-09-26 15:44:49 +02:00
Martin Schwidefsky	c10302efe5	s390/bpf,jit: BPF Just In Time compiler for s390 The s390 implementation of the JIT compiler for packet filter speedup. Reviewed-by: Heiko Carstens <heiko.carstens@de.ibm.com> Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>	2012-09-26 15:44:49 +02:00

23 Commits