kernel_optimize_test/tools/perf/Documentation/perf-mem.txt
Ravi Bangoria f0fabf9c89 perf mem/c2c: Fix perf_mem_events to support powerpc
PowerPC hardware does not have a builtin latency filter (--ldlat) for
the "mem-load" event and perf_mem_events by default includes
"/ldlat=30/" which is causing a failure on PowerPC. Refactor the code to
support "perf mem/c2c" on PowerPC.

This patch depends on kernel side changes done my Madhavan:
https://lists.ozlabs.org/pipermail/linuxppc-dev/2018-December/182596.html

Signed-off-by: Ravi Bangoria <ravi.bangoria@linux.ibm.com>
Acked-by: Jiri Olsa <jolsa@kernel.org>
Cc: Dick Fowles <fowles@inreach.com>
Cc: Don Zickus <dzickus@redhat.com>
Cc: Joe Mario <jmario@redhat.com>
Cc: Madhavan Srinivasan <maddy@linux.vnet.ibm.com>
Cc: Michael Ellerman <mpe@ellerman.id.au>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: linuxppc-dev@lists.ozlabs.org
Link: http://lkml.kernel.org/r/20190129132412.771-1-ravi.bangoria@linux.ibm.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2019-02-04 11:32:14 -03:00

93 lines
2.2 KiB
Plaintext

perf-mem(1)
===========
NAME
----
perf-mem - Profile memory accesses
SYNOPSIS
--------
[verse]
'perf mem' [<options>] (record [<command>] | report)
DESCRIPTION
-----------
"perf mem record" runs a command and gathers memory operation data
from it, into perf.data. Perf record options are accepted and are passed through.
"perf mem report" displays the result. It invokes perf report with the
right set of options to display a memory access profile. By default, loads
and stores are sampled. Use the -t option to limit to loads or stores.
Note that on Intel systems the memory latency reported is the use-latency,
not the pure load (or store latency). Use latency includes any pipeline
queueing delays in addition to the memory subsystem latency.
OPTIONS
-------
<command>...::
Any command you can specify in a shell.
-i::
--input=<file>::
Input file name.
-f::
--force::
Don't do ownership validation
-t::
--type=<type>::
Select the memory operation type: load or store (default: load,store)
-D::
--dump-raw-samples::
Dump the raw decoded samples on the screen in a format that is easy to parse with
one sample per line.
-x::
--field-separator=<separator>::
Specify the field separator used when dump raw samples (-D option). By default,
The separator is the space character.
-C::
--cpu=<cpu>::
Monitor only on the list of CPUs provided. Multiple CPUs can be provided as a
comma-separated list with no space: 0,1. Ranges of CPUs are specified with -: 0-2. Default
is to monitor all CPUS.
-U::
--hide-unresolved::
Only display entries resolved to a symbol.
-p::
--phys-data::
Record/Report sample physical addresses
RECORD OPTIONS
--------------
-e::
--event <event>::
Event selector. Use 'perf mem record -e list' to list available events.
-K::
--all-kernel::
Configure all used events to run in kernel space.
-U::
--all-user::
Configure all used events to run in user space.
-v::
--verbose::
Be more verbose (show counter open errors, etc)
--ldlat <n>::
Specify desired latency for loads event. (x86 only)
In addition, for report all perf report options are valid, and for record
all perf record options.
SEE ALSO
--------
linkperf:perf-record[1], linkperf:perf-report[1]