Commit Graph

400 Commits

Author SHA1 Message Date
Arnaldo Carvalho de Melo
afb7b4f08e perf tools: Factor out the map initialization
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Cc: Frederic Weisbecker <fweisbec@gmail.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Paul Mackerras <paulus@samba.org>
Cc: Mike Galbraith <efault@gmx.de>
LKML-Reference: <1256927305-4628-1-git-send-email-acme@infradead.org>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
2009-11-02 16:52:11 +01:00
Arnaldo Carvalho de Melo
66bd8424cc perf tools: Delay loading symtabs till we hit a map with it
So that we can have a quicker start on perf top and even
speedups in the other tools, as we can have maps with no hits,
so no need to load its symtabs.

Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Cc: Frederic Weisbecker <fweisbec@gmail.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Paul Mackerras <paulus@samba.org>
Cc: Mike Galbraith <efault@gmx.de>
LKML-Reference: <1256773881-4191-1-git-send-email-acme@infradead.org>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
2009-10-29 08:23:40 +01:00
Marti Raudsepp
689d301878 perf tools: Output 'perf list' to stdout not stderr
Writing to stdout is probably the expected behavior because the
user explicitly asked for a list.

Signed-off-by: Marti Raudsepp <marti@juffo.org>
Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
Cc: Paul Mackerras <paulus@samba.org>
LKML-Reference: <4ebb59420ef057972167.1256603585@localhost>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
2009-10-27 14:52:32 +01:00
Marti Raudsepp
85df6f683e perf tools: Notify user when unrecognized event is specified
Previously no indication was given about what went wrong.

Signed-off-by: Marti Raudsepp <marti@juffo.org>
Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
Cc: Paul Mackerras <paulus@samba.org>
LKML-Reference: <03ec9ee96f17cef05424.1256603584@localhost>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
2009-10-27 14:52:31 +01:00
Arnaldo Carvalho de Melo
5b2bb75a0d perf top: Support userspace symbols too
Example:

Compiling the kernel with 'make -k 22 allyesconfig'

[root@emilia linux-2.6-tip]# perf top -r 90
------------------------------------------------------------------------------
   PerfTop:    3669 irqs/sec  kernel:59.9% [1000Hz cycles],  (all, 8 CPUs)
------------------------------------------------------------------------------

             samples  pcnt function                                 DSO
             _______ _____ ________________________________ ________________

             3062.00  6.5% clear_page_c                     [kernel]
             2233.00  4.8% _int_malloc                      /lib64/libc-2.5.so
             2100.00  4.5% yylex                            /home/acme/git/build/allyesconfig/scripts/genksyms/genksyms
             2029.00  4.3% memset                           /lib64/libc-2.5.so
             1224.00  2.6% page_fault                       [kernel]
             1075.00  2.3% __GI_strlen                      /lib64/libc-2.5.so
              863.00  1.8% sub_preempt_count                [kernel]
              822.00  1.8% __GI_memcpy                      /lib64/libc-2.5.so
              810.00  1.7% __GI_vfprintf                    /lib64/libc-2.5.so
              786.00  1.7% _int_free                        /lib64/libc-2.5.so
              775.00  1.7% __GI_strcmp                      /lib64/libc-2.5.so
              748.00  1.6% _spin_lock                       [kernel]
              699.00  1.5% main                             /home/acme/git/build/allyesconfig/scripts/basic/fixdep
              659.00  1.4% add_preempt_count                [kernel]
              649.00  1.4% yyparse                          /home/acme/git/build/allyesconfig/scripts/genksyms/genksyms
              645.00  1.4% preempt_trace                    [kernel]
              635.00  1.4% __GI___libc_free                 /lib64/libc-2.5.so
              597.00  1.3% trace_preempt_on                 [kernel]
              551.00  1.2% __GI___libc_malloc               /lib64/libc-2.5.so
              516.00  1.1% _spin_lock_irqsave               [kernel]
              481.00  1.0% copy_user_generic_string         [kernel]
              479.00  1.0% unmap_vmas                       [kernel]
              429.00  0.9% _IO_file_xsputn_internal         /lib64/libc-2.5.so
              425.00  0.9% __GI_strncpy                     /lib64/libc-2.5.so
              416.00  0.9% get_page_from_freelist           [kernel]
              414.00  0.9% malloc_consolidate               /lib64/libc-2.5.so
              406.00  0.9% get_parent_ip                    [kernel]
              362.00  0.8% __rmqueue                        [kernel]
              347.00  0.7% in_lock_functions                [kernel]
              316.00  0.7% __d_lookup                       [kernel]

[root@emilia linux-2.6-tip]#

More polishing is needed to print just DSO basename when not
--verbose, etc.

Supporting a 'comm' column requires some more reworking of 'perf
top' internals as we will need to use something like the hist
entries 'perf report' uses and will be done in another patch.

Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Cc: Frederic Weisbecker <fweisbec@gmail.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Paul Mackerras <paulus@samba.org>
Cc: Mike Galbraith <efault@gmx.de>
LKML-Reference: <1256592199-9608-3-git-send-email-acme@redhat.com>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
2009-10-27 13:51:54 +01:00
Arnaldo Carvalho de Melo
234fbbf508 perf tools: Generalize event synthesizing routines
Because we will need it in 'perf top' to support userspace
symbols for existing threads.

Now we pass a callback that will receive the synthesized event
and then write it to the output file in 'perf record' and in the
upcoming patch for 'perf top' we will just immediatelly create
the in memory representation of threads and maps.

Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Cc: Frederic Weisbecker <fweisbec@gmail.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Paul Mackerras <paulus@samba.org>
Cc: Mike Galbraith <efault@gmx.de>
LKML-Reference: <1256592199-9608-2-git-send-email-acme@redhat.com>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
2009-10-27 13:51:53 +01:00
Arnaldo Carvalho de Melo
7f3bedcc93 perf record: Fix race where process can disappear while reading its /proc/pid/tasks
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Cc: Frederic Weisbecker <fweisbec@gmail.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Paul Mackerras <paulus@samba.org>
Cc: Mike Galbraith <efault@gmx.de>
LKML-Reference: <1256592199-9608-1-git-send-email-acme@redhat.com>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
2009-10-27 13:51:53 +01:00
Michael Cree
fcd14b3203 perf tools, Alpha: Add Alpha support to perf.h
For the perf tool the patch implements an Alpha specific section
in the perf.h header file.

Signed-off-by: Michael Cree <mcree@orcon.net.nz>
Cc: Richard Henderson <rth@twiddle.net>
Cc: Ivan Kokshaysky <ink@jurassic.park.msu.ru>
Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
Cc: Paul Mackerras <paulus@samba.org>
LKML-Reference: <1256545926-6972-1-git-send-email-mcree@orcon.net.nz>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
2009-10-26 09:45:41 +01:00
Arnaldo Carvalho de Melo
6beba7adbe perf tools: Unify debug messages mechanisms
We were using eprintf in some places, that looks at a global
'verbose' level, and at other places passing a 'v' parameter to
specify the verbosity level, unify it by introducing
pr_{err,warning,debug,etc}, just like in the kernel.

Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Cc: Frederic Weisbecker <fweisbec@gmail.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Paul Mackerras <paulus@samba.org>
Cc: Mike Galbraith <efault@gmx.de>
LKML-Reference: <1256153646-10097-1-git-send-email-acme@redhat.com>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
2009-10-23 08:22:47 +02:00
Frederic Weisbecker
802da5f228 perf tools: Drop asm/types.h wrapper
Wrapping the kernel headers is dangerous when it comes to arch
headers. Once we wrap asm/types.h, it will also replace the
glibc asm/types.h, not only the kernel one.

This results in build errors on some machines.

Drop this wrapper and do its work from linux/types.h wrapper,
also the glibc asm/types.h can already handle most of the type
definition it was doing (typedef __u64, __u32, etc...).

Todo: Check the others asm/*.h wrappers to prevent from other
conflicts.

Reported-by: Ingo Molnar <mingo@elte.hu>
Signed-off-by: Frederic Weisbecker <fweisbec@gmail.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Arnaldo Carvalho de Melo <acme@redhat.com>
Cc: Mike Galbraith <efault@gmx.de>
Cc: Paul Mackerras <paulus@samba.org>
Cc: Anton Blanchard <anton@samba.org>
LKML-Reference: <1256246604-17156-3-git-send-email-fweisbec@gmail.com>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
2009-10-23 07:55:19 +02:00
Frederic Weisbecker
a4fb581b15 perf tools: Bind callchains to the first sort dimension column
Currently, the callchains are displayed using a constant left
margin. So depending on the current sort dimension
configuration, callchains may appear to be well attached to the
first sort dimension column field which is mostly the case,
except when the first dimension of sorting is done by comm,
because these are right aligned.

This patch binds the callchain to the first letter in the first
column, whatever type of column it is (dso, comm, symbol).
Before:

     0.80%             perf  [k] __lock_acquire
             __lock_acquire
             lock_acquire
             |
             |--58.33%-- _spin_lock
             |          |
             |          |--28.57%-- inotify_should_send_event
             |          |          fsnotify
             |          |          __fsnotify_parent

After:

     0.80%             perf  [k] __lock_acquire
                       __lock_acquire
                       lock_acquire
                       |
                       |--58.33%-- _spin_lock
                       |          |
                       |          |--28.57%-- inotify_should_send_event
                       |          |          fsnotify
                       |          |          __fsnotify_parent

Also, for clarity, we don't put anymore the callchain as is but:

- If we have a top level ancestor in the callchain, start it
  with a first ascii hook.

  Before:

     0.80%             perf  [kernel]                        [k] __lock_acquire
                       __lock_acquire
                         lock_acquire
                       |
                       |--58.33%-- _spin_lock
                       |          |
                       |          |--28.57%-- inotify_should_send_event
                       |          |          fsnotify
                      [..]       [..]

   After:

     0.80%             perf  [kernel]                         [k] __lock_acquire
                       |
                       --- __lock_acquire
                           lock_acquire
                          |
                          |--58.33%-- _spin_lock
                          |          |
                          |          |--28.57%-- inotify_should_send_event
                          |          |          fsnotify
                         [..]       [..]

- Otherwise, if we have several top level ancestors, then
  display these like we did before:

       1.69%           Xorg
                       |
                       |--21.21%-- vread_hpet
                       |          0x7fffd85b46fc
                       |          0x7fffd85b494d
                       |          0x7f4fafb4e54d
                       |
                       |--15.15%-- exaOffscreenAlloc
                       |
                       |--9.09%-- I830WaitLpRing

Signed-off-by: Frederic Weisbecker <fweisbec@gmail.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Arnaldo Carvalho de Melo <acme@redhat.com>
Cc: Mike Galbraith <efault@gmx.de>
Cc: Paul Mackerras <paulus@samba.org>
Cc: Anton Blanchard <anton@samba.org>
LKML-Reference: <1256246604-17156-2-git-send-email-fweisbec@gmail.com>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
2009-10-23 07:55:18 +02:00
Frederic Weisbecker
af0a6fa463 perf tools: Fix missing top level callchain
While recursively printing the branches of each callchains, we
forget to display the root. It is never printed.

Say we have:

    symbol
    f1
    f2
     |
     -------- f3
     |        f4
     |
     ---------f5
              f6

Actually we never see that, instead it displays:

    symbol
    |
    --------- f3
    |         f4
    |
    --------- f5
              f6

However f1 is always the same than "symbol" and if we are
sorting by symbols first then "symbol", f1 and f2 will be well
aligned like in the above example, so displaying f1 looks
redundant here.

But if we are sorting by something else first (dso, comm,
etc...), displaying f1 doesn't look redundant but rather
necessary because the symbol is not well aligned anymore with
its callchain:

     comm     dso        symbol
     f1
     f2
     |
     --------- [...]

And we want the callchain to be obvious.
So we fix the bug by printing the root branch, but we also
filter its first entry if we are sorting by symbols first.

Reported-by: Anton Blanchard <anton@samba.org>
Signed-off-by: Frederic Weisbecker <fweisbec@gmail.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Arnaldo Carvalho de Melo <acme@redhat.com>
Cc: Mike Galbraith <efault@gmx.de>
Cc: Paul Mackerras <paulus@samba.org>
LKML-Reference: <1256246604-17156-1-git-send-email-fweisbec@gmail.com>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
2009-10-23 07:55:16 +02:00
Steven Rostedt
4e3b799d7d perf tools: Use strsep() over strtok_r() for parsing single line
The second argument in the strtok_r() function is not to be used
generically and can have different implementations. Currently
the function parsing of the perf trace code uses the second
argument to copy data from. This can crash the tool or just have
unpredictable results.

The correct solution is to use strsep() which has a defined
result.

I also added a check to see if the result was correct, and will
break out of the loop in case it fails to parse as expected.

Reported-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Signed-off-by: Steven Rostedt <rostedt@goodmis.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Frederic Weisbecker <fweisbec@gmail.com>
LKML-Reference: <20091020232034.237814877@goodmis.org>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
2009-10-21 13:39:57 +02:00
Steven Rostedt
60d526f7fa perf tools: Add 'make DEBUG=1' to remove the -O6 cflag
When using gdb to debug perf, it is practically impossible to
use when perf is compiled with -O6. For developers, this patch
adds the DEBUG feature to the make command line so that a
developer can easily remove the optimization flag.

LKML-Reference: <1255590330.8392.446.camel@twins>
Signed-off-by: Steven Rostedt <rostedt@goodmis.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Frederic Weisbecker <fweisbec@gmail.com>
Cc: Arnaldo Carvalho de Melo <acme@redhat.com>
LKML-Reference: <20091020232033.984323261@goodmis.org>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
2009-10-21 13:39:57 +02:00
Arnaldo Carvalho de Melo
c88e4bf60d perf top: Fix symbol annotation
We need to use map->unmap_ip() here too to match section
relative symbol address to the absolute address needed to match
objdump -dS addresses.

Reported-by: Mike Galbraith <efault@gmx.de>
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Cc: Frederic Weisbecker <fweisbec@gmail.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Paul Mackerras <paulus@samba.org>
LKML-Reference: <1256061295-19835-1-git-send-email-acme@redhat.com>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
2009-10-20 21:12:59 +02:00
Arnaldo Carvalho de Melo
8f0b037398 perf annotate: Remove requirement of passing a symbol name
If the user doesn't pass a symbol name to annotate, it will
annotate all the symbols that have hits, in order, just like
'perf report -s comm,dso,symbol'.

This is a natural followup patch to the one that uses
output_hists to find the symbols with hits.

The common case is to annotate the first few entries at the top
of a perf report, so lets type less characters.

Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Cc: Frederic Weisbecker <fweisbec@gmail.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Paul Mackerras <paulus@samba.org>
Cc: Mike Galbraith <efault@gmx.de>
LKML-Reference: <1256058509-19678-1-git-send-email-acme@redhat.com>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
2009-10-20 21:12:58 +02:00
Arnaldo Carvalho de Melo
e42049926e perf annotate: Use the sym_priv_size area for the histogram
We have this sym_priv_size mechanism for attaching private areas
to struct symbol entries but annotate wasn't using it, adding
private areas to struct symbol in addition to a ->priv pointer.

Scrap all that and use the sym_priv_size mechanism.

Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Cc: Frederic Weisbecker <fweisbec@gmail.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Paul Mackerras <paulus@samba.org>
Cc: Mike Galbraith <efault@gmx.de>
LKML-Reference: <1256055940-19511-1-git-send-email-acme@redhat.com>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
2009-10-20 21:12:58 +02:00
Arnaldo Carvalho de Melo
ed52ce2e3c perf tools: Add ->unmap_ip operation to struct map
We need this because we get section relative addresses when
reading the symtabs, but when a tool like 'perf annotate' needs
to match these address to what 'objdump -dS' produces we need
the address + section back again.

So in annotate now we look at the 'struct hist_entry' instances
(that weren't really being used) so that we iterate only over
the symbols that had some hit and get the map where that
particular hit happened so that we can get the right address to
match with annotate.

Verified that at least:

 perf annotate mmap_read_counter # Uses the ~/bin/perf binary
 perf annotate --vmlinux /home/acme/git/build/perf/vmlinux intel_pmu_enable_all

on a 'perf record perf top' session seems to work.

Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Cc: Frederic Weisbecker <fweisbec@gmail.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Paul Mackerras <paulus@samba.org>
Cc: Mike Galbraith <efault@gmx.de>
LKML-Reference: <1255979877-12533-1-git-send-email-acme@redhat.com>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
2009-10-20 07:55:51 +02:00
Arjan van de Ven
bbe2987bea perf timechart: Add a process filter
During the Kernel Summit demo of perf/ftrace/timechart, there
was a feature request to have a process filter for timechart so
that you can zoom into one or a few processes that you are
really interested in.

This patch adds basic support for this feature, the -p
(--process) option now can select a PID or a process name to be
shown. Multiple -p options are allowed, and the combined set
will be included in the output.

Signed-off-by: Arjan van de Ven <arjan@linux.intel.com>
Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
Cc: Mike Galbraith <efault@gmx.de>
Cc: Paul Mackerras <paulus@samba.org>
Cc: Arnaldo Carvalho de Melo <acme@redhat.com>
Cc: Frederic Weisbecker <fweisbec@gmail.com>
LKML-Reference: <20091020070939.7d0fb8a7@infradead.org>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
2009-10-20 07:55:50 +02:00
Ingo Molnar
c258449bc9 Merge branch 'perf/urgent' into perf/core
Merge reason: Queue up dependent patch.

Signed-off-by: Ingo Molnar <mingo@elte.hu>
2009-10-20 07:51:44 +02:00
Arjan van de Ven
2e600d01c1 perf timechart: Improve the visual appearance of scheduler delays
[from KS feedback]

Currently, scheduler delays are shown in a mostly transparent,
light yellow color. This color is rather hard to see on several
screens, especially projectors.

This patch changes the color of the scheduler delays to be a
much more "hard" yellow that survived the kernel summit
projector.

Reported-by: Linus Torvalds <torvalds@osdl.org>
Signed-off-by: Arjan van de Ven <arjan@linux.intel.com
Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
Cc: Mike Galbraith <efault@gmx.de>
Cc: Paul Mackerras <paulus@samba.org>
Cc: Arnaldo Carvalho de Melo <acme@redhat.com>
Cc: Frederic Weisbecker <fweisbec@gmail.com>
LKML-Reference: <20091020064731.20ae126a@infradead.org>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
2009-10-20 03:39:21 +02:00
Arjan van de Ven
3bc2a39c69 perf timechart: Fix the wakeup-arrows that point to non-visible processes
The timechart wakeup arrows currently show no process
information when the waker/wakee are processes that are not
actually chosen to be shown on the timechart.

This patch fixes this oversight, by looking through all
processes (after giving preference to visible processes) as well
as falling back to just showing the PID if no name for the
process can be resolved.

Signed-off-by: Arjan van de Ven <arjan@linux.intel.com>
Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
Cc: Mike Galbraith <efault@gmx.de>
Cc: Paul Mackerras <paulus@samba.org>
Cc: Arnaldo Carvalho de Melo <acme@redhat.com>
Cc: Frederic Weisbecker <fweisbec@gmail.com>
LKML-Reference: <20091020064649.0e4959b2@infradead.org>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
2009-10-20 03:39:16 +02:00
Arnaldo Carvalho de Melo
79b9ad361b perf tools: Add bunch of missing headers to LIB_H
Build dependencies were not properly mapped out.

Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Frederic Weisbecker <fweisbec@gmail.com>
LKML-Reference: <1255973491-11626-1-git-send-email-acme@redhat.com>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
2009-10-20 03:00:36 +02:00
Arnaldo Carvalho de Melo
20639c15d2 perf tools: Add missing tools/perf/util/include/string.h
To cure a bunch of:

In file included from util/include/linux/bitmap.h:1,
                 from util/header.h:8,
                 from builtin-trace.c:7:
util/include/../../../../include/linux/bitmap.h:8:26: error:
linux/string.h: No such file or directory make: ***
[builtin-trace.o] Error 1 make: *** Waiting for unfinished
jobs....

Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Acked-by: Frederic Weisbecker <fweisbec@gmail.com>
Cc: Steven Rostedt <srostedt@redhat.com>
Cc: Peter Zijlstra <peterz@infradead.org>
LKML-Reference: <1255972296-11500-1-git-send-email-acme@redhat.com>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
2009-10-20 02:59:34 +02:00
Ingo Molnar
dd86e72abd perf stat: Count branches first
Count branches first, cache-misses second. The reason is that
on x86 branches are not counted by all counters on all CPUs.

Before:

 Performance counter stats for 'ls':

       0.756653  task-clock-msecs         #      0.802 CPUs
              0  context-switches         #      0.000 M/sec
              0  CPU-migrations           #      0.000 M/sec
            250  page-faults              #      0.330 M/sec
        2375725  cycles                   #   3139.781 M/sec
        1628129  instructions             #      0.685 IPC
          19643  cache-references         #     25.960 M/sec
           4608  cache-misses             #      6.090 M/sec
         342532  branches                 #    452.694 M/sec
  <not counted>  branch-misses

    0.000943356  seconds time elapsed

After:

 Performance counter stats for 'ls':

       1.056734  task-clock-msecs         #      0.859 CPUs
              0  context-switches         #      0.000 M/sec
              0  CPU-migrations           #      0.000 M/sec
            259  page-faults              #      0.245 M/sec
        3345932  cycles                   #   3166.295 M/sec
        3074090  instructions             #      0.919 IPC
         616928  branches                 #    583.806 M/sec
          39279  branch-misses            #      6.367 %
          21312  cache-references         #     20.168 M/sec
           3661  cache-misses             #      3.464 M/sec

    0.001230551  seconds time elapsed

(also prettify the printout of branch misses, in case it's
 getting scaled.)

Cc: Tim Blechmann <tim@klingt.org>
Cc: Paul Mackerras <paulus@samba.org>
Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
LKML-Reference: <4ADC3975.8050109@klingt.org>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
---
 tools/perf/builtin-stat.c |    2 ++
 1 files changed, 2 insertions(+), 0 deletions(-)

diff --git a/tools/perf/builtin-stat.c b/tools/perf/builtin-stat.c
index c373683..95a55ea 100644
--- a/tools/perf/builtin-stat.c
+++ b/tools/perf/builtin-stat.c
@@ -59,6 +59,8 @@ static struct perf_event_attr default_attrs[] = {
   { .type = PERF_TYPE_HARDWARE, .config = PERF_COUNT_HW_INSTRUCTIONS	},
   { .type = PERF_TYPE_HARDWARE, .config = PERF_COUNT_HW_CACHE_REFERENCES},
   { .type = PERF_TYPE_HARDWARE, .config = PERF_COUNT_HW_CACHE_MISSES	},
+  { .type = PERF_TYPE_HARDWARE, .config = PERF_COUNT_HW_BRANCH_INSTRUCTIONS},
+  { .type = PERF_TYPE_HARDWARE, .config = PERF_COUNT_HW_BRANCH_MISSES	},

 };
---
 tools/perf/builtin-stat.c |   20 ++++++++++----------
 1 files changed, 10 insertions(+), 10 deletions(-)

diff --git a/tools/perf/builtin-stat.c b/tools/perf/builtin-stat.c
index 95a55ea..90e0a26 100644
--- a/tools/perf/builtin-stat.c
+++ b/tools/perf/builtin-stat.c
@@ -50,17 +50,17 @@

 static struct perf_event_attr default_attrs[] = {

-  { .type = PERF_TYPE_SOFTWARE, .config = PERF_COUNT_SW_TASK_CLOCK	},
-  { .type = PERF_TYPE_SOFTWARE, .config = PERF_COUNT_SW_CONTEXT_SWITCHES},
-  { .type = PERF_TYPE_SOFTWARE, .config = PERF_COUNT_SW_CPU_MIGRATIONS	},
-  { .type = PERF_TYPE_SOFTWARE, .config = PERF_COUNT_SW_PAGE_FAULTS	},
-
-  { .type = PERF_TYPE_HARDWARE, .config = PERF_COUNT_HW_CPU_CYCLES	},
-  { .type = PERF_TYPE_HARDWARE, .config = PERF_COUNT_HW_INSTRUCTIONS	},
-  { .type = PERF_TYPE_HARDWARE, .config = PERF_COUNT_HW_CACHE_REFERENCES},
-  { .type = PERF_TYPE_HARDWARE, .config = PERF_COUNT_HW_CACHE_MISSES	},
-  { .type = PERF_TYPE_HARDWARE, .config = PERF_COUNT_HW_BRANCH_INSTRUCTIONS},
-  { .type = PERF_TYPE_HARDWARE, .config = PERF_COUNT_HW_BRANCH_MISSES	},
+  { .type = PERF_TYPE_SOFTWARE, .config = PERF_COUNT_SW_TASK_CLOCK		},
+  { .type = PERF_TYPE_SOFTWARE, .config = PERF_COUNT_SW_CONTEXT_SWITCHES	},
+  { .type = PERF_TYPE_SOFTWARE, .config = PERF_COUNT_SW_CPU_MIGRATIONS		},
+  { .type = PERF_TYPE_SOFTWARE, .config = PERF_COUNT_SW_PAGE_FAULTS		},
+
+  { .type = PERF_TYPE_HARDWARE, .config = PERF_COUNT_HW_CPU_CYCLES		},
+  { .type = PERF_TYPE_HARDWARE, .config = PERF_COUNT_HW_INSTRUCTIONS		},
+  { .type = PERF_TYPE_HARDWARE, .config = PERF_COUNT_HW_CACHE_REFERENCES	},
+  { .type = PERF_TYPE_HARDWARE, .config = PERF_COUNT_HW_CACHE_MISSES		},
+  { .type = PERF_TYPE_HARDWARE, .config = PERF_COUNT_HW_BRANCH_INSTRUCTIONS	},
+  { .type = PERF_TYPE_HARDWARE, .config = PERF_COUNT_HW_BRANCH_MISSES		},

 };
2009-10-19 13:36:32 +02:00
Ingo Molnar
56aab464ff perf stat: Re-align the default_attrs[] array
Clean up the array definition to be vertically aligned.

No functional effects.

Cc: Tim Blechmann <tim@klingt.org>
Cc: Paul Mackerras <paulus@samba.org>
Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
LKML-Reference: <4ADC3975.8050109@klingt.org>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
---
 tools/perf/builtin-stat.c |    2 ++
 1 files changed, 2 insertions(+), 0 deletions(-)

diff --git a/tools/perf/builtin-stat.c b/tools/perf/builtin-stat.c
index c373683..95a55ea 100644
--- a/tools/perf/builtin-stat.c
+++ b/tools/perf/builtin-stat.c
@@ -59,6 +59,8 @@ static struct perf_event_attr default_attrs[] = {
   { .type = PERF_TYPE_HARDWARE, .config = PERF_COUNT_HW_INSTRUCTIONS	},
   { .type = PERF_TYPE_HARDWARE, .config = PERF_COUNT_HW_CACHE_REFERENCES},
   { .type = PERF_TYPE_HARDWARE, .config = PERF_COUNT_HW_CACHE_MISSES	},
+  { .type = PERF_TYPE_HARDWARE, .config = PERF_COUNT_HW_BRANCH_INSTRUCTIONS},
+  { .type = PERF_TYPE_HARDWARE, .config = PERF_COUNT_HW_BRANCH_MISSES	},

 };
2009-10-19 13:27:08 +02:00
Tim Blechmann
12133afffc perf stat: Add branch performance events to default output
Adds performance event information about branches
and branch misses to the default output of perf stat.

Signed-off-by: Tim Blechmann <tim@klingt.org>
Cc: Paul Mackerras <paulus@samba.org>
Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
LKML-Reference: <4ADC3975.8050109@klingt.org>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
2009-10-19 13:26:42 +02:00
Randy Dunlap
1abc7f5500 perf tools: Display better error messages on missing packages
Check for libelf headers and glibc headers separately so that
the error message correctly identifies which package
installation is missing/needed.

Signed-off-by: Randy Dunlap <randy.dunlap@oracle.com>
Cc: paulus@samba.org
Cc: a.p.zijlstra@chello.nl
Cc: efault@gmx.de
Cc: fweisbec@gmail.com
Cc: Arnaldo Carvalho de Melo <acme@redhat.com>
LKML-Reference: <4ADBCCE8.3060300@oracle.com>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
2009-10-19 10:06:37 +02:00
Tim Blechmann
dc79959aaf perf top: Fix --delay_secs 0 division by zero
Add delay_secs sanity check to handle_keypress,
this fixes a division by zero crash.

Signed-off-by: Tim Blechmann <tim@klingt.org>
Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
Cc: Paul Mackerras <paulus@samba.org>
Cc: Mike Galbraith <efault@gmx.de>
Cc: Arnaldo Carvalho de Melo <acme@redhat.com>
Cc: Frederic Weisbecker <fweisbec@gmail.com>
LKML-Reference: <4AD9EBFD.106@klingt.org>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
2009-10-19 09:52:39 +02:00
Frederic Weisbecker
db9f11e36d perf tools: Use DECLARE_BITMAP instead of an open-coded array
Use DECLARE_BITMAP instead of an open coded array for our bitmap
of featured sections.

This makes the array an unsigned long instead of a u64 but since
we use a 256 bits bitmap, the array size shouldn't vary between
different boxes.

Signed-off-by: Frederic Weisbecker <fweisbec@gmail.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Arnaldo Carvalho de Melo <acme@redhat.com>
Cc: Mike Galbraith <efault@gmx.de>
Cc: Paul Mackerras <paulus@samba.org>
Cc: Steven Rostedt <rostedt@goodmis.org>
LKML-Reference: <1255795038-13751-1-git-send-email-fweisbec@gmail.com>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
2009-10-19 09:26:35 +02:00
Frederic Weisbecker
2ba0825075 perf tools: Introduce bitmask'ed additional headers
This provides a new set of bitmasked headers. A new field is
added in the perf headers that implements a bitmap storing
optional features present in the perf.data file.

The layout can be pictured like this:

(Usual perf headers)(Features bitmap)[Feature 0][Feature
n][Feature 255]

If the bit n is set, then the feature n is used in this file.
They are all set in order. This brings a backward and forward
compatibility.

The trace_info section has moved into such optional features,
this is the first and only one for now.

This is backward compatible with the .32 file version although
it doesn't support the previous separate trace.info file.

And finally it doesn't support the current interim development
version.

Signed-off-by: Frederic Weisbecker <fweisbec@gmail.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Arnaldo Carvalho de Melo <acme@redhat.com>
Cc: Mike Galbraith <efault@gmx.de>
Cc: Paul Mackerras <paulus@samba.org>
Cc: Steven Rostedt <rostedt@goodmis.org>
LKML-Reference: <1255792354-11304-2-git-send-email-fweisbec@gmail.com>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
2009-10-19 09:26:35 +02:00
Frederic Weisbecker
5a116dd279 perf tools: Use kernel bitmap library
Use the kernel bitmap library for internal perf tools uses.

Signed-off-by: Frederic Weisbecker <fweisbec@gmail.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Arnaldo Carvalho de Melo <acme@redhat.com>
Cc: Mike Galbraith <efault@gmx.de>
Cc: Paul Mackerras <paulus@samba.org>
Cc: Steven Rostedt <rostedt@goodmis.org>
LKML-Reference: <1255792354-11304-1-git-send-email-fweisbec@gmail.com>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
2009-10-19 09:26:34 +02:00
Anton Blanchard
11018201b8 perf stat: Add branch performance metric
When we count both branches and branch-misses it is useful to
print out the percentage of branch-misses:

 # perf stat -e branches -e branch-misses /bin/true

 Performance counter stats for '/bin/true':

         401684  branches                 #      0.000 M/sec
          23301  branch-misses            #      5.801 %

Signed-off-by: Anton Blanchard <anton@samba.org>
Cc: paulus@samba.org
Cc: a.p.zijlstra@chello.nl
LKML-Reference: <20091018112923.GQ4808@kryten>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
2009-10-19 09:20:21 +02:00
Julia Lawall
f39cdf25bf perf tools: Move dereference after NULL test
In each case, if the NULL test on thread is needed, then the
dereference should be after the NULL test.

A simplified version of the semantic match that detects this
problem is as follows (http://coccinelle.lip6.fr/):

// <smpl>
@match exists@
expression x, E;
identifier fld;
@@

* x->fld
  ... when != \(x = E\|&x\)
* x == NULL
// </smpl>

Signed-off-by: Julia Lawall <julia@diku.dk>
LKML-Reference: <Pine.LNX.4.64.0910170842500.9213@ask.diku.dk>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
2009-10-17 09:29:10 +02:00
Ingo Molnar
210f9cb2cf perf tools: Bump version to 0.0.2
We released the first version of perf with 0.0.1 in v2.6.31,
time to double our version number to 0.0.2 ;-)

Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
Cc: Mike Galbraith <efault@gmx.de>
Cc: Paul Mackerras <paulus@samba.org>
Cc: Arnaldo Carvalho de Melo <acme@redhat.com>
Cc: Frederic Weisbecker <fweisbec@gmail.com>
LKML-Reference: <new-submission>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
2009-10-16 10:34:28 +02:00
Li Zefan
c171b552a7 perf trace: Add filter Suppport
Add a new option "--filter <filter_str>" to perf record, and
it should be right after "-e trace_point":

 #./perf record -R -f -e irq:irq_handler_entry --filter irq==18
 ^C
 # ./perf trace
            perf-4303  ... irq_handler_entry: irq=18 handler=eth0
            init-0     ... irq_handler_entry: irq=18 handler=eth0
            init-0     ... irq_handler_entry: irq=18 handler=eth0
            init-0     ... irq_handler_entry: irq=18 handler=eth0
            init-0     ... irq_handler_entry: irq=18 handler=eth0

See Documentation/trace/events.txt for the syntax of filter
expressions.

Signed-off-by: Li Zefan <lizf@cn.fujitsu.com>
Acked-by: Peter Zijlstra <peterz@infradead.org>
Acked-by: Frederic Weisbecker <fweisbec@gmail.com>
Cc: Steven Rostedt <rostedt@goodmis.org>
Cc: Tom Zanussi <tzanussi@gmail.com>
LKML-Reference: <4AD6955F.90602@cn.fujitsu.com>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
2009-10-15 11:35:23 +02:00
Steven Rostedt
c4dc775f53 perf tools: Remove all char * typecasts and use const in prototype
The (char *) for all the static strings was a fix for the
symptom and not the disease. The real issue was that the
function prototypes needed to be declared "const char *".

Signed-off-by: Steven Rostedt <srostedt@redhat.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Frederic Weisbecker <fweisbec@gmail.com>
Cc: Arnaldo Carvalho de Melo <acme@redhat.com>
LKML-Reference: <20091014194400.635935008@goodmis.org>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
2009-10-15 10:43:17 +02:00
Steven Rostedt
afdf1a404e perf tools: Handle - and + in parsing trace print format
The opterators '-' and '+' are not handled in the trace print
format.

To do: '++' and '--'.

Signed-off-by: Steven Rostedt <srostedt@redhat.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Frederic Weisbecker <fweisbec@gmail.com>
Cc: Arnaldo Carvalho de Melo <acme@redhat.com>
LKML-Reference: <20091014194400.330843045@goodmis.org>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
2009-10-15 10:42:40 +02:00
Steven Rostedt
cda48461c7 perf tools: Add latency format to trace output
Add the irqs disabled, preemption count, need resched, and other
info that is shown in the latency format of ftrace.

 # perf trace -l
    perf-16457   2..s2. 53636.260344: kmem_cache_free: call_site=ffffffff811198f
    perf-16457   2..s2. 53636.264330: kmem_cache_free: call_site=ffffffff811198f
    perf-16457   2d.s4. 53636.300006: kmem_cache_free: call_site=ffffffff810d889

Signed-off-by: Steven Rostedt <srostedt@redhat.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Frederic Weisbecker <fweisbec@gmail.com>
Cc: Arnaldo Carvalho de Melo <acme@redhat.com>
LKML-Reference: <20091014194400.076588953@goodmis.org>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
2009-10-15 10:42:39 +02:00
Steven Rostedt
0d1da915c7 perf tools: Handle both versions of ftrace output
The ftrace output events can have either arguments or no
arguments. The parser needs to be able to handle both.

Signed-off-by: Steven Rostedt <srostedt@redhat.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Frederic Weisbecker <fweisbec@gmail.com>
Cc: Arnaldo Carvalho de Melo <acme@redhat.com>
LKML-Reference: <20091014194359.790221427@goodmis.org>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
2009-10-15 10:42:39 +02:00
Steven Rostedt
ffa1895561 perf tools: Fix bprintk reading in trace output
The bprintk parsing was broken in more ways than one.

The file parsing was incorrect, and the words used by the
arguments are always 4 bytes aligned, even on 64-bit machines.

Signed-off-by: Steven Rostedt <srostedt@redhat.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Frederic Weisbecker <fweisbec@gmail.com>
Cc: Arnaldo Carvalho de Melo <acme@redhat.com>
LKML-Reference: <20091014194359.520931637@goodmis.org>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
2009-10-15 10:42:38 +02:00
Steven Rostedt
07a4bdddcf perf tools: Still continue on failed parsing of an event
Even though an event may fail to parse, we should not kill the
entire report. The trace should still be able to show what it
can.

If an event fails to parse, a warning is printed, and the output
continues.

Signed-off-by: Steven Rostedt <srostedt@redhat.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Frederic Weisbecker <fweisbec@gmail.com>
Cc: Arnaldo Carvalho de Melo <acme@redhat.com>
LKML-Reference: <20091014194359.190809589@goodmis.org>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
2009-10-15 10:42:38 +02:00
Steven Rostedt
13999e5934 perf tools: Handle the case with and without the "signed" trace field
The trace format files now have a "signed" field. But we should
still be able to handle the kernels that do not have this field.

Signed-off-by: Steven Rostedt <srostedt@redhat.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Frederic Weisbecker <fweisbec@gmail.com>
Cc: Arnaldo Carvalho de Melo <acme@redhat.com>
LKML-Reference: <20091014194358.888239553@goodmis.org>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
2009-10-15 10:42:37 +02:00
Steven Rostedt
f1d1feecf0 perf tools: Handle newlines in trace parsing better
New lines between args in the trace format can break the
parsing. This should not be the case.

Signed-off-by: Steven Rostedt <srostedt@redhat.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Frederic Weisbecker <fweisbec@gmail.com>
Cc: Arnaldo Carvalho de Melo <acme@redhat.com>
LKML-Reference: <20091014194358.637991808@goodmis.org>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
2009-10-15 10:42:37 +02:00
Steven Rostedt
b99af87482 perf tools: Handle * as typecast in trace parsing
The '*' is currently only treated as a multiplication, and it
needs to be handled as a typecast pointer.

This is the version used by trace-cmd.

Signed-off-by: Steven Rostedt <srostedt@redhat.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Frederic Weisbecker <fweisbec@gmail.com>
Cc: Arnaldo Carvalho de Melo <acme@redhat.com>
LKML-Reference: <20091014194358.409327875@goodmis.org>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
2009-10-15 10:42:36 +02:00
Steven Rostedt
0959b8d65c perf tools: Handle arrays in print fields for trace parsing
The array used by the ftrace stack events (caller[x]) causes
issues with the parser. This adds code to handle the case, but
it also assumes that the array is of type long.

Note, this is a special case used (currently) only by the ftrace
user and kernel stack records.

Signed-off-by: Steven Rostedt <srostedt@redhat.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Frederic Weisbecker <fweisbec@gmail.com>
Cc: Arnaldo Carvalho de Melo <acme@redhat.com>
LKML-Reference: <20091014194358.124833639@goodmis.org>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
2009-10-15 10:42:36 +02:00
Steven Rostedt
298ebc3ef2 perf tools: Handle trace parsing of < and >
The code to handle the '<' and '>' ops was all in place, but
they were not in the switch statement to consider them as valid
ops.

Signed-off-by: Steven Rostedt <srostedt@redhat.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Frederic Weisbecker <fweisbec@gmail.com>
Cc: Arnaldo Carvalho de Melo <acme@redhat.com>
LKML-Reference: <20091014194357.807434040@goodmis.org>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
2009-10-15 10:42:35 +02:00
Steven Rostedt
91ff2bc191 perf tools: Fix backslash processing on trace print formats
The handling of backslashes was broken. It would stop parsing
when encountering one. Also, '\n', '\t', '\r' and '\\' were not
converted.

Signed-off-by: Steven Rostedt <srostedt@redhat.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Frederic Weisbecker <fweisbec@gmail.com>
Cc: Arnaldo Carvalho de Melo <acme@redhat.com>
LKML-Reference: <20091014194357.521974680@goodmis.org>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
2009-10-15 10:42:35 +02:00
Steven Rostedt
924a79af2c perf tools: Handle print concatenations in event format file
kmem_alloc ftrace event format had a string that was broken up
by two tokens. "string 1" "string 2". This patch lets the parser
be able to handle the concatenation.

Signed-off-by: Steven Rostedt <srostedt@redhat.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Frederic Weisbecker <fweisbec@gmail.com>
Cc: Arnaldo Carvalho de Melo <acme@redhat.com>
LKML-Reference: <20091014194357.253818714@goodmis.org>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
2009-10-15 10:42:34 +02:00
Ingo Molnar
b226f744d4 Merge branch 'linus' into perf/core
Merge reason: pick up tools/perf/ changes from upstream.

Signed-off-by: Ingo Molnar <mingo@elte.hu>
2009-10-15 08:44:44 +02:00