kernel_optimize_test/arch/sparc/lib
David S. Miller 9f825962ef sparc64: Niagara-4 bzero/memset, plus use MRU stores in page copy.
This adds optimized memset/bzero/page-clear routines for Niagara-4.

We basically can do what powerpc has been able to do for a decade (via
the "dcbz" instruction), which is to use cache-line-clearing stores for
bzero and for memsets with a 'c' argument of zero.

As long as we make the cache initializing store to each 32-byte
subblock of the L2 cache line, it works.
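
The subblock rule above can be pictured with a portable C model. This is only a sketch, not the NG4memset.S assembly: `model_bzero` and `init_store_subblock` are hypothetical names, a plain `memset` stands in for the cache initializing store, and the 64-byte line size is an assumption (the text only states the 32-byte subblock size). Head and tail bytes that do not cover a whole line are cleared with ordinary byte stores, since a line-initializing store is only safe on lines that are cleared in full.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

#define LINE_BYTES     64u  /* ASSUMPTION: L2 line size, for illustration only */
#define SUBBLOCK_BYTES 32u  /* subblock size named in the commit message */

/* Hypothetical stand-in for one cache initializing store to a subblock. */
static void init_store_subblock(unsigned char *p)
{
    assert(((uintptr_t)p % SUBBLOCK_BYTES) == 0);
    memset(p, 0, SUBBLOCK_BYTES);
}

/* Zero n bytes at dst; returns how many subblock stores the bulk loop issued. */
static size_t model_bzero(void *dst, size_t n)
{
    unsigned char *p = dst;
    size_t stores = 0;

    /* Head: ordinary byte stores until p is line aligned or n runs out. */
    while (n > 0 && ((uintptr_t)p % LINE_BYTES) != 0) {
        *p++ = 0;
        n--;
    }

    /* Bulk: every 32-byte subblock of each whole line gets its own store. */
    while (n >= LINE_BYTES) {
        for (size_t off = 0; off < LINE_BYTES; off += SUBBLOCK_BYTES) {
            init_store_subblock(p + off);
            stores++;
        }
        p += LINE_BYTES;
        n -= LINE_BYTES;
    }

    /* Tail: ordinary byte stores for the final partial line. */
    while (n > 0) {
        *p++ = 0;
        n--;
    }
    return stores;
}
```

Under these assumptions, clearing a line-aligned 8192-byte buffer with `model_bzero` issues 256 subblock stores.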

As with other Niagara-4 optimized routines, the key is to make sure to
avoid any usage of the %asi register, as reads and writes to it cost
at least 50 cycles.

For the user clear cases we don't use these new routines; we use the
Niagara-1 variants instead.  Those have to use %asi in an unavoidable
way.

A Niagara-4 8K page clear costs just under 600 cycles.
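
To put the page-clear figure in perspective, here is a small worked calculation in C. It assumes exactly one store per 32-byte subblock, which is an illustrative simplification and not something the commit states about the real instruction sequence; the helper names are hypothetical, and 600 is used as a round upper bound for "just under 600 cycles".

```c
#include <assert.h>

#define PAGE_BYTES     8192u  /* 8K page from the commit message */
#define SUBBLOCK_BYTES 32u    /* subblock size from the commit message */

/* Number of 32-byte subblocks in one page. */
static unsigned subblocks_per_page(void)
{
    return PAGE_BYTES / SUBBLOCK_BYTES;
}

/* Average cycles spent per subblock for a given whole-page cost. */
static double cycles_per_subblock(double page_cycles)
{
    assert(page_cycles > 0.0);
    return page_cycles / subblocks_per_page();
}

/* Average bytes cleared per cycle for a given whole-page cost. */
static double bytes_per_cycle(double page_cycles)
{
    assert(page_cycles > 0.0);
    return PAGE_BYTES / page_cycles;
}
```

With a 600-cycle bound this gives 256 subblocks per page, roughly 2.3 cycles per subblock, and a little over 13 bytes cleared per cycle.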

Add definitions of the MRU variants of the cache initializing store
ASIs.  By default, cache initializing stores install the line as Least
Recently Used.  If we know we're going to use the data immediately
(which is true for page copies and clears) we can use the Most
Recently Used variant, to decrease the likelihood of the lines being
evicted before they get used.
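
The effect of the install position can be illustrated with a toy model of a single cache set. This is a sketch under loud assumptions: it uses true LRU replacement and a 4-way set, neither of which is taken from the commit or from Niagara-4 documentation, and all names are hypothetical. It counts how many later misses to the same set a freshly installed line survives.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

#define WAYS 4  /* ASSUMPTION: toy associativity, not a Niagara-4 figure */

/* One cache set; tag[0] is most recently used, tag[WAYS-1] is least. */
struct cache_set {
    long tag[WAYS];
};

static void set_init(struct cache_set *s)
{
    for (size_t i = 0; i < WAYS; i++)
        s->tag[i] = -1;  /* -1 marks an empty way */
}

static bool set_contains(const struct cache_set *s, long tag)
{
    for (size_t i = 0; i < WAYS; i++)
        if (s->tag[i] == tag)
            return true;
    return false;
}

/* Install a line, evicting the LRU way; place it at MRU or leave it at LRU. */
static void set_install(struct cache_set *s, long tag, bool mru)
{
    assert(tag >= 0);
    if (!mru) {
        s->tag[WAYS - 1] = tag;  /* overwrite the LRU way, stay LRU */
        return;
    }
    for (size_t i = WAYS - 1; i > 0; i--)
        s->tag[i] = s->tag[i - 1];  /* age every line by one position */
    s->tag[0] = tag;
}

/* How many subsequent ordinary misses does the installed line survive? */
static int misses_survived(bool install_mru)
{
    struct cache_set s;
    int survived = 0;

    set_init(&s);
    set_install(&s, 1000, install_mru);  /* the line we will need soon */
    for (long t = 0; t < 2 * WAYS; t++) {
        set_install(&s, t, true);  /* unrelated miss, installed as MRU */
        if (!set_contains(&s, 1000))
            break;
        survived++;
    }
    return survived;
}
```

In this toy model a line installed as LRU is evicted by the very next miss to its set, while a line installed as MRU survives `WAYS - 1` further misses.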

Signed-off-by: David S. Miller <davem@davemloft.net>
2012-10-05 13:45:26 -07:00
ashldi3.S sparc: Convert some assembler over to linakge.h's ENTRY/ENDPROC 2012-05-11 20:33:22 -07:00
ashrdi3.S sparc: Convert some assembler over to linakge.h's ENTRY/ENDPROC 2012-05-11 20:33:22 -07:00
atomic_64.S sparc: Convert some assembler over to linakge.h's ENTRY/ENDPROC 2012-05-11 20:33:22 -07:00
atomic32.c sparc: Fix __atomic_add_unless() return value. 2011-08-04 02:47:40 -07:00
bitext.c sparc: use bitmap_set() 2011-02-08 22:52:53 -08:00
bitops.S sparc: Convert some assembler over to linakge.h's ENTRY/ENDPROC 2012-05-11 20:33:22 -07:00
blockops.S sparc: Convert some assembler over to linakge.h's ENTRY/ENDPROC 2012-05-11 20:33:22 -07:00
bzero.S sparc: Convert some assembler over to linakge.h's ENTRY/ENDPROC 2012-05-11 20:33:22 -07:00
checksum_32.S sparc32: Fixed unaligned memory copying in function __csum_partial_copy_sparc_generic 2011-05-11 21:35:04 -07:00
checksum_64.S
clear_page.S
cmpdi2.c
copy_in_user.S
copy_page.S sparc64: Consistently use fsrc2 rather than fmovd in optimized asm. 2012-06-27 01:25:23 -07:00
copy_user.S
COPYING.LIB
csum_copy_from_user.S
csum_copy_to_user.S
csum_copy.S
divdi3.S sparc32: Kill off software 32-bit multiply/divide routines. 2012-05-15 11:23:47 -07:00
ffs.S sparc: Use popc when possible for ffs/__ffs/ffz. 2011-08-02 21:28:53 -07:00
GENbzero.S
GENcopy_from_user.S
GENcopy_to_user.S
GENmemcpy.S
GENpage.S
GENpatch.S
hweight.S sparc: Use popc if possible for hweight routines. 2011-08-02 21:28:50 -07:00
iomap.c sparc: switch to GENERIC_PCI_IOMAP 2011-12-04 15:59:49 +02:00
ipcsum.S sparc: Convert some assembler over to linakge.h's ENTRY/ENDPROC 2012-05-11 20:33:22 -07:00
ksyms.c sparc64: Add SHA1 driver making use of the 'sha1' instruction. 2012-08-20 15:08:49 -07:00
libgcc.h
locks.S
lshrdi3.S sparc: Convert some assembler over to linakge.h's ENTRY/ENDPROC 2012-05-11 20:33:22 -07:00
Makefile sparc64: Niagara-4 bzero/memset, plus use MRU stores in page copy. 2012-10-05 13:45:26 -07:00
mcount.S sparc64: Allocate sufficient stack space in ftrace stubs. 2010-04-13 18:59:02 -07:00
memcmp.S
memcpy.S sparc32: Correct the return value of memcpy. 2011-10-20 15:17:23 -07:00
memmove.S sparc: Convert some assembler over to linakge.h's ENTRY/ENDPROC 2012-05-11 20:33:22 -07:00
memscan_32.S
memscan_64.S
memset.S sparc: Stop trying to be so fancy and use __builtin_{memcpy,memset}() 2009-12-10 23:32:10 -08:00
muldi3.S sparc32: Kill off software 32-bit multiply/divide routines. 2012-05-15 11:23:47 -07:00
NG2copy_from_user.S
NG2copy_to_user.S
NG2memcpy.S sparc64: Fix return value of Niagara-2 memcpy. 2012-09-27 01:06:43 -07:00
NG2patch.S
NG4clear_page.S sparc64: Niagara-4 bzero/memset, plus use MRU stores in page copy. 2012-10-05 13:45:26 -07:00
NG4copy_from_user.S sparc64: Fix comment type in NG4 copy from user. 2012-09-27 14:26:41 -07:00
NG4copy_page.S sparc64: Niagara-4 bzero/memset, plus use MRU stores in page copy. 2012-10-05 13:45:26 -07:00
NG4copy_to_user.S sparc64: Add SPARC-T4 optimized memcpy. 2012-09-27 00:35:11 -07:00
NG4memcpy.S sparc64: Fix trailing whitespace in NG4 memcpy. 2012-09-28 13:08:22 -07:00
NG4memset.S sparc64: Niagara-4 bzero/memset, plus use MRU stores in page copy. 2012-10-05 13:45:26 -07:00
NG4patch.S sparc64: Niagara-4 bzero/memset, plus use MRU stores in page copy. 2012-10-05 13:45:26 -07:00
NGbzero.S
NGcopy_from_user.S
NGcopy_to_user.S
NGmemcpy.S
NGpage.S sparc64: Add SPARC-T4 optimized memcpy. 2012-09-27 00:35:11 -07:00
NGpatch.S
PeeCeeI.c
strlen.S
strncmp_32.S sparc: Convert some assembler over to linakge.h's ENTRY/ENDPROC 2012-05-11 20:33:22 -07:00
strncmp_64.S sparc: Convert some assembler over to linakge.h's ENTRY/ENDPROC 2012-05-11 20:33:22 -07:00
U1copy_from_user.S
U1copy_to_user.S
U1memcpy.S sparc64: Consistently use fsrc2 rather than fmovd in optimized asm. 2012-06-27 01:25:23 -07:00
U3copy_from_user.S
U3copy_to_user.S
U3memcpy.S
U3patch.S
ucmpdi2.c sparc32: add ucmpdi2 2012-05-19 15:23:57 -07:00
udivdi3.S sparc32: Kill off software 32-bit multiply/divide routines. 2012-05-15 11:23:47 -07:00
user_fixup.c
usercopy.c lib: Sparc's strncpy_from_user is generic enough, move under lib/ 2012-05-24 13:12:28 -07:00
VISsave.S
xor.S sparc: Convert some assembler over to linakge.h's ENTRY/ENDPROC 2012-05-11 20:33:22 -07:00