1beaef29c3
For memcpy, the source pages are memset to zero only when --cycles is
used. Without explicit writes, all source pages are likely to be mapped
to the same zero page, so results differ wildly depending on whether
--cycles is passed.

Before this fix:

  $ export cmd="./perf stat -e LLC-loads -- ./perf bench \
      mem memcpy -s 1024MB -l 100 -f default"
  $ $cmd
      2,935,826 LLC-loads
      3.821677452 seconds time elapsed
  $ $cmd --cycles
      217,533,436 LLC-loads
      8.616725985 seconds time elapsed

After this fix:

  $ $cmd
      214,459,686 LLC-loads
      8.674301124 seconds time elapsed
  $ $cmd --cycles
      214,758,651 LLC-loads
      8.644480006 seconds time elapsed

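
The idea behind the fix can be sketched as follows. This is a minimal illustration, not the code in mem-functions.c: the names alloc_touched, copy_loops, SRC_FILL and DST_FILL are hypothetical, and it assumes a system where anonymous pages that have never been written may all be backed by one shared zero page. Writing to the source buffer before the copy loop gives every source page its own physical frame, so the copy reads real memory whichever timing mode is selected.

```c
#include <stddef.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical fill patterns. Any write forces a page to be populated;
 * distinct patterns are used here only so the result can be checked. */
#define SRC_FILL 0x5a
#define DST_FILL 0xff

/* Allocate `size` bytes and write to every byte, so that no page of the
 * buffer is left untouched before the copy loop starts. */
static unsigned char *alloc_touched(size_t size, int fill)
{
	unsigned char *buf = malloc(size);

	if (buf != NULL)
		memset(buf, fill, size);
	return buf;
}

/* Copy `size` bytes from a pre-written source `nr_loops` times.
 * Returns 0 if the destination ends up holding the source pattern,
 * -1 otherwise (allocation failure, zero size, or no copy performed). */
static int copy_loops(size_t size, int nr_loops)
{
	unsigned char *src = alloc_touched(size, SRC_FILL);
	unsigned char *dst = alloc_touched(size, DST_FILL);
	int ret = -1;

	if (src != NULL && dst != NULL && size > 0) {
		for (int i = 0; i < nr_loops; i++)
			memcpy(dst, src, size);
		/* Reading the result also keeps the copies from being elided. */
		if (dst[size - 1] == SRC_FILL)
			ret = 0;
	}

	free(src);
	free(dst);
	return ret;
}
```

A caller would wrap the loop in copy_loops with whatever timer it uses. The point is only that the source is initialized unconditionally, not inside one timing mode's code path.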
Fixes:

| File |
|---|
| bench.h |
| Build |
| epoll-ctl.c |
| epoll-wait.c |
| find-bit-bench.c |
| futex-hash.c |
| futex-lock-pi.c |
| futex-requeue.c |
| futex-wake-parallel.c |
| futex-wake.c |
| futex.h |
| kallsyms-parse.c |
| mem-functions.c |
| mem-memcpy-arch.h |
| mem-memcpy-x86-64-asm-def.h |
| mem-memcpy-x86-64-asm.S |
| mem-memcpy-x86-64-lib.c |
| mem-memset-arch.h |
| mem-memset-x86-64-asm-def.h |
| mem-memset-x86-64-asm.S |
| numa.c |
| sched-messaging.c |
| sched-pipe.c |
| synthesize.c |
| syscall.c |