kernel_optimize_test/samples/bpf
Daniel Wagner 0fb1170ee6 bpf: BPF based latency tracing
BPF offers another way to generate latency histograms. We attach
kprobes at trace_preempt_off and trace_preempt_on and calculate the
time that elapses between the off and on transitions.

The first array stores the start timestamp; the key is the CPU id.
The second array stores the histogram, indexed by log2(time diff).
We need to use static allocation here (arrays, not hash tables). The
kprobes hooking into trace_preempt_on|off must not call into any
dynamic memory allocation or free path, because that could cause them
to be invoked recursively. Besides that, static allocation reduces
jitter in the measurement.

CPU 0
      latency        : count     distribution
       1 -> 1        : 0        |                                        |
       2 -> 3        : 0        |                                        |
       4 -> 7        : 0        |                                        |
       8 -> 15       : 0        |                                        |
      16 -> 31       : 0        |                                        |
      32 -> 63       : 0        |                                        |
      64 -> 127      : 0        |                                        |
     128 -> 255      : 0        |                                        |
     256 -> 511      : 0        |                                        |
     512 -> 1023     : 0        |                                        |
    1024 -> 2047     : 0        |                                        |
    2048 -> 4095     : 166723   |*************************************** |
    4096 -> 8191     : 19870    |***                                     |
    8192 -> 16383    : 6324     |                                        |
   16384 -> 32767    : 1098     |                                        |
   32768 -> 65535    : 190      |                                        |
   65536 -> 131071   : 179      |                                        |
  131072 -> 262143   : 18       |                                        |
  262144 -> 524287   : 4        |                                        |
  524288 -> 1048575  : 1363     |                                        |
CPU 1
      latency        : count     distribution
       1 -> 1        : 0        |                                        |
       2 -> 3        : 0        |                                        |
       4 -> 7        : 0        |                                        |
       8 -> 15       : 0        |                                        |
      16 -> 31       : 0        |                                        |
      32 -> 63       : 0        |                                        |
      64 -> 127      : 0        |                                        |
     128 -> 255      : 0        |                                        |
     256 -> 511      : 0        |                                        |
     512 -> 1023     : 0        |                                        |
    1024 -> 2047     : 0        |                                        |
    2048 -> 4095     : 114042   |*************************************** |
    4096 -> 8191     : 9587     |**                                      |
    8192 -> 16383    : 4140     |                                        |
   16384 -> 32767    : 673      |                                        |
   32768 -> 65535    : 179      |                                        |
   65536 -> 131071   : 29       |                                        |
  131072 -> 262143   : 4        |                                        |
  262144 -> 524287   : 1        |                                        |
  524288 -> 1048575  : 364      |                                        |
CPU 2
      latency        : count     distribution
       1 -> 1        : 0        |                                        |
       2 -> 3        : 0        |                                        |
       4 -> 7        : 0        |                                        |
       8 -> 15       : 0        |                                        |
      16 -> 31       : 0        |                                        |
      32 -> 63       : 0        |                                        |
      64 -> 127      : 0        |                                        |
     128 -> 255      : 0        |                                        |
     256 -> 511      : 0        |                                        |
     512 -> 1023     : 0        |                                        |
    1024 -> 2047     : 0        |                                        |
    2048 -> 4095     : 40147    |*************************************** |
    4096 -> 8191     : 2300     |*                                       |
    8192 -> 16383    : 828      |                                        |
   16384 -> 32767    : 178      |                                        |
   32768 -> 65535    : 59       |                                        |
   65536 -> 131071   : 2        |                                        |
  131072 -> 262143   : 0        |                                        |
  262144 -> 524287   : 1        |                                        |
  524288 -> 1048575  : 174      |                                        |
CPU 3
      latency        : count     distribution
       1 -> 1        : 0        |                                        |
       2 -> 3        : 0        |                                        |
       4 -> 7        : 0        |                                        |
       8 -> 15       : 0        |                                        |
      16 -> 31       : 0        |                                        |
      32 -> 63       : 0        |                                        |
      64 -> 127      : 0        |                                        |
     128 -> 255      : 0        |                                        |
     256 -> 511      : 0        |                                        |
     512 -> 1023     : 0        |                                        |
    1024 -> 2047     : 0        |                                        |
    2048 -> 4095     : 29626    |*************************************** |
    4096 -> 8191     : 2704     |**                                      |
    8192 -> 16383    : 1090     |                                        |
   16384 -> 32767    : 160      |                                        |
   32768 -> 65535    : 72       |                                        |
   65536 -> 131071   : 32       |                                        |
  131072 -> 262143   : 26       |                                        |
  262144 -> 524287   : 12       |                                        |
  524288 -> 1048575  : 298      |                                        |

All this is based on the tracex3 example written by
Alexei Starovoitov <ast@plumgrid.com>.

Signed-off-by: Daniel Wagner <daniel.wagner@bmw-carit.de>
Cc: Alexei Starovoitov <ast@plumgrid.com>
Cc: "David S. Miller" <davem@davemloft.net>
Cc: Daniel Borkmann <daniel@iogearbox.net>
Cc: Ingo Molnar <mingo@kernel.org>
Cc: linux-kernel@vger.kernel.org
Cc: netdev@vger.kernel.org
Acked-by: Alexei Starovoitov <ast@plumgrid.com>
Acked-by: Daniel Borkmann <daniel@iogearbox.net>
Signed-off-by: David S. Miller <davem@davemloft.net>
2015-06-23 06:09:58 -07:00
bpf_helpers.h bpf: introduce current->pid, tgid, uid, gid, comm accessors 2015-06-15 15:53:50 -07:00
bpf_load.c samples/bpf: bpf_tail_call example for tracing 2015-05-21 17:07:59 -04:00
bpf_load.h samples/bpf: Add simple non-portable kprobe filter example 2015-04-02 13:25:50 +02:00
lathist_kern.c bpf: BPF based latency tracing 2015-06-23 06:09:58 -07:00
lathist_user.c bpf: BPF based latency tracing 2015-06-23 06:09:58 -07:00
libbpf.c samples/bpf: Add simple non-portable kprobe filter example 2015-04-02 13:25:50 +02:00
libbpf.h Merge git://git.kernel.org/pub/scm/linux/kernel/git/davem/net-next 2015-04-15 09:00:47 -07:00
Makefile bpf: BPF based latency tracing 2015-06-23 06:09:58 -07:00
sock_example.c samples/bpf: Add simple non-portable kprobe filter example 2015-04-02 13:25:50 +02:00
sockex1_kern.c samples: bpf: add skb->field examples and tests 2015-03-15 22:02:28 -04:00
sockex1_user.c samples: bpf: add skb->field examples and tests 2015-03-15 22:02:28 -04:00
sockex2_kern.c samples: bpf: add skb->field examples and tests 2015-03-15 22:02:28 -04:00
sockex2_user.c samples: bpf: add skb->field examples and tests 2015-03-15 22:02:28 -04:00
sockex3_kern.c bpf: allow programs to write to certain skb fields 2015-06-07 02:01:33 -07:00
sockex3_user.c samples/bpf: bpf_tail_call example for networking 2015-05-21 17:07:59 -04:00
tcbpf1_kern.c bpf: make programs see skb->data == L2 for ingress and egress 2015-06-07 02:01:33 -07:00
test_maps.c samples: bpf: relax test_maps check 2015-01-26 17:20:40 -08:00
test_verifier.c bpf: allow programs to write to certain skb fields 2015-06-07 02:01:33 -07:00
tracex1_kern.c samples/bpf: Add simple non-portable kprobe filter example 2015-04-02 13:25:50 +02:00
tracex1_user.c samples/bpf: Add simple non-portable kprobe filter example 2015-04-02 13:25:50 +02:00
tracex2_kern.c bpf: introduce current->pid, tgid, uid, gid, comm accessors 2015-06-15 15:53:50 -07:00
tracex2_user.c bpf: introduce current->pid, tgid, uid, gid, comm accessors 2015-06-15 15:53:50 -07:00
tracex3_kern.c samples/bpf: Add IO latency analysis (iosnoop/heatmap) tool 2015-04-02 13:25:51 +02:00
tracex3_user.c samples/bpf: Add IO latency analysis (iosnoop/heatmap) tool 2015-04-02 13:25:51 +02:00
tracex4_kern.c samples/bpf: Add kmem_alloc()/free() tracker tool 2015-04-02 13:25:51 +02:00
tracex4_user.c samples/bpf: Add kmem_alloc()/free() tracker tool 2015-04-02 13:25:51 +02:00
tracex5_kern.c samples/bpf: bpf_tail_call example for tracing 2015-05-21 17:07:59 -04:00
tracex5_user.c samples/bpf: bpf_tail_call example for tracing 2015-05-21 17:07:59 -04:00