kernel_optimize_test/arch/arm/include/asm
Will Deacon 398aa66827 ARM: 6212/1: atomic ops: add memory constraints to inline asm
Currently, the 32-bit and 64-bit atomic operations on ARM do not
include memory constraints in the inline assembly blocks. In the
case of barrier-less operations [for example, atomic_add], this
means that the compiler may constant-fold values which have actually
been modified by a call to an atomic operation.

This issue can be observed in the atomic64_test routine in
<kernel root>/lib/atomic64_test.c:

00000000 <test_atomic64>:
   0:	e1a0c00d 	mov	ip, sp
   4:	e92dd830 	push	{r4, r5, fp, ip, lr, pc}
   8:	e24cb004 	sub	fp, ip, #4
   c:	e24dd008 	sub	sp, sp, #8
  10:	e24b3014 	sub	r3, fp, #20
  14:	e30d000d 	movw	r0, #53261	; 0xd00d
  18:	e3011337 	movw	r1, #4919	; 0x1337
  1c:	e34c0001 	movt	r0, #49153	; 0xc001
  20:	e34a1aa3 	movt	r1, #43683	; 0xaaa3
  24:	e16300f8 	strd	r0, [r3, #-8]!
  28:	e30c0afe 	movw	r0, #51966	; 0xcafe
  2c:	e30b1eef 	movw	r1, #48879	; 0xbeef
  30:	e34d0eaf 	movt	r0, #57007	; 0xdeaf
  34:	e34d1ead 	movt	r1, #57005	; 0xdead
  38:	e1b34f9f 	ldrexd	r4, [r3]
  3c:	e1a34f90 	strexd	r4, r0, [r3]
  40:	e3340000 	teq	r4, #0
  44:	1afffffb 	bne	38 <test_atomic64+0x38>
  48:	e59f0004 	ldr	r0, [pc, #4]	; 54 <test_atomic64+0x54>
  4c:	e3a0101e 	mov	r1, #30
  50:	ebfffffe 	bl	0 <__bug>
  54:	00000000 	.word	0x00000000

The atomic64_set (0x38-0x44) writes to the atomic64_t, but the
compiler doesn't see this, assumes the test condition is always
false and generates an unconditional branch to __bug. The rest of the
test is optimised away.

This patch adds suitable memory constraints to the atomic operations on ARM
to ensure that the compiler is informed of the correct data hazards. We have
to use the "Qo" constraints to avoid hitting the GCC anomaly described at
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=44492 , where the compiler
makes assumptions about the writeback in the addressing mode used by the
inline assembly. These constraints forbid the use of auto{inc,dec} addressing
modes, so it doesn't matter if we don't use the operand exactly once.
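
As an illustration of the constraint change described above, here is a minimal
sketch of a barrier-less atomic add with a "+Qo" memory operand. The names
atomic_add_sketch and check_add_is_visible, and the local atomic_t stand-in,
are hypothetical and are not taken from atomic.h. The preprocessor guard and
the __atomic_fetch_add fallback are assumptions added only so that the sketch
builds on non-ARM hosts.

```c
#include <assert.h>

/* Hypothetical stand-in for the kernel's atomic_t; not the real header. */
typedef struct { int counter; } atomic_t;

static inline void atomic_add_sketch(int i, atomic_t *v)
{
#if defined(__arm__) && !defined(__thumb__) && \
    defined(__ARM_ARCH) && __ARM_ARCH >= 6
	unsigned long tmp;
	int result;

	/*
	 * "+Qo" (v->counter) tells the compiler that the asm both reads and
	 * writes the counter, so a cached or constant-folded copy of it
	 * cannot be reused afterwards. The address itself is still passed
	 * separately in a register (%3); the memory operand (%2) is never
	 * referenced in the template.
	 */
	__asm__ __volatile__(
	"1:	ldrex	%0, [%3]\n"
	"	add	%0, %0, %4\n"
	"	strex	%1, %0, [%3]\n"
	"	teq	%1, #0\n"
	"	bne	1b"
	: "=&r" (result), "=&r" (tmp), "+Qo" (v->counter)
	: "r" (&v->counter), "Ir" (i)
	: "cc");
#else
	/* Assumed fallback for other targets: a relaxed compiler builtin. */
	__atomic_fetch_add(&v->counter, i, __ATOMIC_RELAXED);
#endif
}

/*
 * Mirrors the shape of the failing test: initialise, modify through the
 * atomic operation, then read the value back. The read after the call
 * must observe the update rather than the initial constant.
 */
static int check_add_is_visible(void)
{
	atomic_t v = { 1 };

	atomic_add_sketch(2, &v);
	return v.counter;
}
```

With the memory operand in place, check_add_is_visible() should return 3.
Without it, the compiler would be entitled to assume that v.counter still
holds its initial value.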

Cc: stable@kernel.org
Reviewed-by: Nicolas Pitre <nicolas.pitre@linaro.org>
Signed-off-by: Will Deacon <will.deacon@arm.com>
Signed-off-by: Russell King <rmk+kernel@arm.linux.org.uk>
2010-07-09 11:29:35 +01:00
hardware Merge branches 'at91', 'bcmring', 'ep93xx', 'iop', 'misc', 'nomadik', 'omap', 'pxa', 'spear' and 'versatile' into devel 2010-05-17 11:53:39 +01:00
mach [ARM] pxa: fix incorrect gpio type in udc_pxa2xx.h 2010-06-13 23:55:12 +08:00
a.out-core.h
a.out.h headers_check fix: arm, a.out.h 2009-02-01 11:01:22 +05:30
asm-offsets.h kbuild: move asm-offsets.h to include/generated 2009-12-12 13:08:14 +01:00
assembler.h ARM: fix build error in arch/arm/kernel/process.c 2010-04-21 08:45:21 +01:00
atomic.h ARM: 6212/1: atomic ops: add memory constraints to inline asm 2010-07-09 11:29:35 +01:00
auxvec.h
bitops.h ARM: boolean bit testing 2009-10-11 16:25:06 +01:00
bitsperlong.h asm-generic: introduce asm/bitsperlong.h 2009-06-11 21:02:14 +02:00
bug.h [ARM] 5211/2: fix a couple warnings from BUG() usage 2008-09-01 12:06:27 +01:00
bugs.h
byteorder.h byteorder: make swab.h include asm/swab.h like a regular header 2009-01-14 19:56:50 -08:00
cache.h ARM: 5700/1: ARM: Introduce ARM_L1_CACHE_SHIFT to define cache line size 2009-09-15 22:06:38 +01:00
cacheflush.h ARM: 6112/1: Use the Inner Shareable I-cache and BTB ops on ARMv7 SMP 2010-05-08 10:44:30 +01:00
cachetype.h [ARM] Introduce new bitmask based cache type macros 2008-09-25 15:35:28 +01:00
checksum.h
clkdev.h ARM: 6001/1: removing compilation warning comming from clkdev.h 2010-03-29 17:33:32 +01:00
cpu-multi32.h
cpu-single.h
cpu.h ARM: 5872/1: ARM: include needed linux/cpu.h in asm/cpu.h 2010-01-10 13:03:52 +00:00
cputime.h
cputype.h Merge branch 'devel' of git://git.kernel.org/pub/scm/linux/kernel/git/ycmiao/pxa-linux-2.6 into devel 2009-09-21 16:02:30 +01:00
current.h
delay.h
device.h Driver Core: Add platform device arch data V3 2009-07-22 00:28:38 +02:00
div64.h [ARM] 5320/1: fix assembly constraints in implementation of do_div() 2008-10-23 12:53:32 +01:00
dma-mapping.h dma-mapping: arm: use generic pci_set_dma_mask and pci_set_consistent_dma_mask 2010-03-12 15:52:42 -08:00
dma.h ARM: 5870/1: arch/arm: Fix build failure for defconfigs without CONFIG_ISA_DMA_API set 2010-01-10 00:08:03 +00:00
domain.h
ecard.h
elf.h arch/arm/include/asm/elf.h: forward-declare the task-struct 2010-05-01 11:33:00 +01:00
emergency-restart.h
entry-macro-vic2.S ARM: Add common entry code for system with two VICs 2010-01-15 17:10:14 +09:00
errno.h
fb.h
fcntl.h
fiq.h
fixmap.h [ARM] fixmap support 2009-03-15 21:01:20 -04:00
flat.h flat: fix data sections alignment 2009-05-29 08:40:02 -07:00
floppy.h [ARM] Move include/asm-arm/arch-* to arch/arm/*/include/mach 2008-08-07 09:55:48 +01:00
fpstate.h
ftrace.h Merge branch 'devel-stable' into devel 2009-09-12 12:02:26 +01:00
futex.h ARM: fix build error in arch/arm/kernel/process.c 2010-04-21 08:45:21 +01:00
glue.h ARM: 5727/1: Pass IFSR register to do_PrefetchAbort() 2009-10-02 22:34:32 +01:00
gpio.h [ARM] Move include/asm-arm/arch-* to arch/arm/*/include/mach 2008-08-07 09:55:48 +01:00
hardirq.h ARM: 6138/1: Add support for 10 hardirq bits 2010-05-20 23:51:07 +01:00
highmem.h ARM: 6007/1: fix highmem with VIPT cache and DMA 2010-04-14 11:11:27 +01:00
hw_irq.h
hwcap.h [ARM] 5388/1: Add hwcap bits for VFPv3 and VFPv3D16 2009-02-12 10:59:44 +00:00
ide.h
io.h ARM: Add caller information to ioremap 2010-02-15 21:39:11 +00:00
ioctl.h
ioctls.h ARM: 6092/1: atmel_serial: support for RS485 communications 2010-05-04 16:59:11 +01:00
ipcbuf.h
irq_regs.h
irq.h ARM: 6000/1: removing compilation warning comming from <asm/irq.h> 2010-03-29 17:33:31 +01:00
irqflags.h
Kbuild byteorder: make swab.h include asm/swab.h like a regular header 2009-01-14 19:56:50 -08:00
kdebug.h
kexec.h kexec jump: rename KEXEC_CONTROL_CODE_SIZE to KEXEC_CONTROL_PAGE_SIZE 2008-08-15 08:35:42 -07:00
kgdb.h
kmap_types.h kdb: core for kgdb back end (2 of 2) 2010-05-20 21:04:21 -05:00
kprobes.h [ARM] 5206/1: remove kprobe_trap_handler() hack 2008-09-01 12:06:26 +01:00
leds.h
limits.h
linkage.h
local.h
localtimer.h [ARM] smp: allow re-use of realview localtimer TWD support 2009-05-17 19:16:41 +01:00
locks.h
mach-types.h arm: move mach-types to include/generated 2009-12-12 13:08:14 +01:00
mc146818rtc.h [ARM] Convert asm/io.h to linux/io.h 2008-09-06 12:10:45 +01:00
memory.h ARM: 5928/1: Change type of VMALLOC_END to unsigned long. 2010-02-15 21:40:33 +00:00
mman.h arm: add arch_mmap_check(), get rid of sys_arm_mremap() 2009-12-11 06:34:09 -05:00
mmu_context.h ARM: 5905/1: ARM: Global ASID allocation on SMP 2010-02-15 21:39:51 +00:00
mmu.h ARM: 5905/1: ARM: Global ASID allocation on SMP 2010-02-15 21:39:51 +00:00
mmzone.h [ARM] Move include/asm-arm/arch-* to arch/arm/*/include/mach 2008-08-07 09:55:48 +01:00
module.h [ARM] 5384/1: unwind: Add stack unwinding support for loadable modules 2009-02-19 11:27:19 +00:00
msgbuf.h
mtd-xip.h [ARM] move asm/xip.h's mach/hardware.h include to mach/xip.h 2008-12-14 13:22:51 +00:00
mutex.h
nwflash.h
outercache.h ARM: 5994/1: ARM: Add outer_cache_fns.sync function pointer (2/4) 2010-03-25 21:13:49 +00:00
page-nommu.h nommu: Remove the memory_start/end variables from ARM page-nommu.h 2009-07-24 12:35:01 +01:00
page.h ARM: Pass VMA to copy_user_highpage() implementations 2009-10-05 15:17:45 +01:00
param.h
parport.h
pci.h ARM: 6058/1: Add support for PCI domains 2010-04-22 21:38:11 +01:00
percpu.h
perf_event.h ARM: 6071/1: perf-events: allow modules to query the number of hardware counters 2010-05-17 11:53:58 +01:00
pgalloc.h ARM: implement highpte 2009-08-17 20:02:06 +01:00
pgtable-hwdef.h
pgtable-nommu.h ARM: 5988/1: pgprot_dmacoherent() for non-mmu builds 2010-03-13 10:48:22 +00:00
pgtable.h ARM: Optionally allow ARMv6 to use 'normal, bufferable' memory for DMA 2010-05-17 11:52:11 +01:00
pmu.h ARM: 6064/1: pmu: register IRQs at runtime 2010-05-17 11:53:57 +01:00
poll.h
posix_types.h
proc-fns.h ARM: Kill CONFIG_CPU_32 2009-12-18 16:07:53 +00:00
processor.h ARM: 6194/1: change definition of cpu_relax() for ARM11MPCore 2010-07-01 10:13:52 +01:00
procinfo.h
ptrace.h arm: use generic ptrace_resume code 2010-03-12 15:52:38 -08:00
resource.h
scatterlist.h asm-generic: remove ARCH_HAS_SG_CHAIN in scatterlist.h 2010-05-27 09:12:54 -07:00
sections.h
segment.h
sembuf.h
serial.h
setup.h ARM: 5880/1: arm: use generic infrastructure for early params 2010-02-15 21:39:13 +00:00
shmbuf.h
shmparam.h
sigcontext.h
siginfo.h
signal.h asm-generic: rename termios.h, signal.h and mman.h 2009-06-11 21:01:52 +02:00
sizes.h [ARM] Kirkwood: create a mapping for the Security Accelerator SRAM 2009-06-08 13:05:02 -04:00
smp_plat.h ARM: Fix ptrace accesses 2009-12-14 14:54:28 +00:00
smp_scu.h [ARM] smp: separate SCU support code from realview 2009-05-17 19:00:37 +01:00
smp_twd.h ARM: 6125/1: ARM TWD: move TWD registers to common header 2010-05-12 11:18:13 +01:00
smp.h ARM: rename mach_cpu_disable() to platform_cpu_disable() 2010-05-15 15:03:51 +01:00
socket.h net: Generalize socket rx gap / receive queue overflow cmsg 2009-10-12 13:26:31 -07:00
sockios.h
sparsemem.h [ARM] mm: enable sparsemem on clps7500 and RiscPC 2008-10-01 17:24:04 +01:00
spinlock_types.h locking: Convert raw_rwlock to arch_rwlock 2009-12-14 23:55:32 +01:00
spinlock.h ARM: 5897/1: spinlock: don't use deprecated barriers on ARMv7 2010-02-15 21:39:50 +00:00
stacktrace.h [ARM] 5382/1: unwind: Reorganise the stacktrace support 2009-02-12 13:21:17 +00:00
stat.h
statfs.h ARM: Use <asm-generic/statfs.h> 2008-09-04 09:46:11 +01:00
string.h [ARM] remove memzero() 2008-11-27 12:37:59 +00:00
swab.h ARM: 5772/1: Use REV and REV16 for byte swapping on ARMv6+ 2009-10-25 15:59:53 +00:00
system.h Merge branch 'devel-stable' into devel 2010-05-17 17:24:04 +01:00
tcm.h ARM: 5580/2: ARM TCM (Tightly-Coupled Memory) support v3 2009-09-15 22:11:05 +01:00
termbits.h
termios.h
therm.h
thread_info.h add descriptive comment for TIF_MEMDIE task flag declaration. 2010-05-14 11:13:27 +02:00
thread_notify.h ARM: Convert VFP/Crunch/XscaleCP thread_release() to exit_thread() 2009-12-18 14:53:41 +00:00
timex.h [ARM] Move include/asm-arm/arch-* to arch/arm/*/include/mach 2008-08-07 09:55:48 +01:00
tlb.h mm: Pass virtual address to [__]p{te,ud,md}_free_tlb() 2009-07-27 12:10:38 -07:00
tlbflush.h ARM: 6112/1: Use the Inner Shareable I-cache and BTB ops on ARMv7 SMP 2010-05-08 10:44:30 +01:00
topology.h
traps.h [ARM] 5381/1: unwind: Reorganise the traps.c code 2009-02-12 13:21:15 +00:00
types.h
uaccess.h ARM: fix build error in arch/arm/kernel/process.c 2010-04-21 08:45:21 +01:00
ucontext.h ARM: 6051/1: VFP: preserve the HW context when calling signal handlers 2010-04-14 11:11:30 +01:00
unaligned.h
unified.h Fix "W" macro in arch/arm/include/asm/unified.h 2009-09-18 23:30:11 +01:00
unistd.h Add generic sys_ipc wrapper 2010-03-12 15:52:32 -08:00
unwind.h [ARM] 5383/2: unwind: Add core support for ARM stack unwinding 2009-02-19 11:26:24 +00:00
user.h ARM: 6051/1: VFP: preserve the HW context when calling signal handlers 2010-04-14 11:11:30 +01:00
vfp.h
vfpmacros.h
vga.h [ARM] Convert asm/io.h to linux/io.h 2008-09-06 12:10:45 +01:00
xor.h