kernel_optimize_test/lib
Ross Zwisler 67a3e8fe90 nd_blk: change aperture mapping from WC to WB
This should result in a sizeable performance gain for reads.  For a
rough comparison I ran a simple read test against PMEM, comparing reads
through a write-combining (WC) mapping with reads through a write-back
(WB) mapping.  This was done on an arbitrary lab machine.

PMEM reads from a write combining mapping:
	# dd of=/dev/null if=/dev/pmem0 bs=4096 count=100000
	100000+0 records in
	100000+0 records out
	409600000 bytes (410 MB) copied, 9.2855 s, 44.1 MB/s

PMEM reads from a write-back mapping:
	# dd of=/dev/null if=/dev/pmem0 bs=4096 count=1000000
	1000000+0 records in
	1000000+0 records out
	4096000000 bytes (4.1 GB) copied, 3.44034 s, 1.2 GB/s

To be able to safely support a write-back aperture I needed to add
support for the "read flush" _DSM flag, as outlined in the DSM spec:

http://pmem.io/documents/NVDIMM_DSM_Interface_Example.pdf

This flag tells the ND BLK driver that it needs to flush the cache lines
associated with the aperture after the aperture is moved but before any
new data is read.  This ensures that any stale cache lines from the
previous contents of the aperture will be discarded from the processor
cache, and the new data will be read properly from the DIMM.  We know
that the cache lines are clean and will be discarded without any
writeback because either a) the previous aperture operation was a read,
and we never modified the contents of the aperture, or b) the previous
aperture operation was a write and we must have written back the dirtied
contents of the aperture to the DIMM before the I/O was completed.
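
The ordering requirement described here can be illustrated with a small
userspace toy model.  This is only a sketch and not the ND BLK driver
code: struct toy_blk, toy_init(), toy_move_aperture() and toy_read() are
hypothetical names, and the "cache" is a plain buffer standing in for
processor cache lines that cover the aperture.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

#define TOY_DIMM_SIZE     32  /* toy backing store size, in bytes */
#define TOY_APERTURE_SIZE 8   /* toy aperture window size, in bytes */

/* Hypothetical model: a DIMM, a movable window, and a cached copy of it. */
struct toy_blk {
	unsigned char dimm[TOY_DIMM_SIZE];
	size_t window_off;                       /* where the aperture points */
	unsigned char cache[TOY_APERTURE_SIZE];  /* stands in for CPU cache lines */
	bool cache_valid;
	bool read_flush;                         /* models the "read flush" flag */
};

void toy_init(struct toy_blk *b, bool read_flush)
{
	for (size_t i = 0; i < TOY_DIMM_SIZE; i++)
		b->dimm[i] = (unsigned char)i;   /* byte i holds the value i */
	b->window_off = 0;
	b->cache_valid = false;
	b->read_flush = read_flush;
}

/* Move the aperture; with read_flush set, drop the now-stale cached copy. */
void toy_move_aperture(struct toy_blk *b, size_t off)
{
	assert(off + TOY_APERTURE_SIZE <= TOY_DIMM_SIZE);
	b->window_off = off;
	if (b->read_flush)
		b->cache_valid = false;  /* lines are clean, so nothing is written back */
}

/* Read through the aperture; a valid cached copy is served without a refill. */
unsigned char toy_read(struct toy_blk *b, size_t i)
{
	assert(i < TOY_APERTURE_SIZE);
	if (!b->cache_valid) {
		memcpy(b->cache, b->dimm + b->window_off, TOY_APERTURE_SIZE);
		b->cache_valid = true;
	}
	return b->cache[i];
}
```

In this model, reading at window offset 0, moving the aperture to offset
8 and reading again returns the old byte when read_flush is false and
the new byte when it is true.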

In order to add support for the "read flush" flag I needed to add a
generic routine to invalidate cache lines, mmio_flush_range().  This is
protected by the ARCH_HAS_MMIO_FLUSH Kconfig variable, and is currently
only supported on x86.
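
As a rough illustration of what invalidating a range one cache line at
a time involves, here is a userspace sketch.  It is not the kernel's
mmio_flush_range() implementation: flush_range_by_line(), line_flush_fn
and count_flush() are hypothetical names, and the per-line callback
stands in for whatever architecture-specific flush instruction a real
implementation would issue.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical per-line hook; real code would flush the line at line_addr. */
typedef void (*line_flush_fn)(uintptr_t line_addr, void *ctx);

/*
 * Call flush() once for every cache line overlapping [addr, addr + size).
 * line_size must be a power of two.  Returns the number of lines visited.
 * Address wrap-around at the top of the address space is not handled.
 */
size_t flush_range_by_line(uintptr_t addr, size_t size, size_t line_size,
			   line_flush_fn flush, void *ctx)
{
	assert(line_size != 0 && (line_size & (line_size - 1)) == 0);
	if (size == 0)
		return 0;

	uintptr_t p = addr & ~(uintptr_t)(line_size - 1);  /* align start down */
	uintptr_t end = addr + size;
	size_t n = 0;

	for (; p < end; p += line_size) {
		flush(p, ctx);
		n++;
	}
	return n;
}

/* Example hook that only counts how many lines would be flushed. */
void count_flush(uintptr_t line_addr, void *ctx)
{
	(void)line_addr;
	(*(size_t *)ctx)++;
}
```

Aligning the start address down matters because a range that begins in
the middle of a line still leaves that whole line in the cache; a
two-byte range straddling a line boundary therefore touches two lines.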

Signed-off-by: Ross Zwisler <ross.zwisler@linux.intel.com>
Signed-off-by: Dan Williams <dan.j.williams@intel.com>
2015-08-27 19:38:28 -04:00
842 lib: correct 842 decompress for 32 bit 2015-05-13 10:31:59 +08:00
fonts
lz4 lz4: fix system halt at boot kernel on x86_64 2015-05-24 11:56:29 -07:00
lzo
mpi Merge git://git.kernel.org/pub/scm/linux/kernel/git/herbert/crypto-2.6 2015-06-22 21:04:48 -07:00
raid6 powerpc updates for 4.2 2015-06-24 08:46:32 -07:00
reed_solomon
xz
zlib_deflate
zlib_inflate
.gitignore
argv_split.c
asn1_decoder.c
assoc_array.c
atomic64_test.c
atomic64.c
audit.c
average.c
bcd.c
bch.c
bitmap.c bitmap: remove explicit newline handling using scnprintf format string 2015-06-25 17:00:40 -07:00
bitrev.c
bsearch.c
btree.c
bug.c module: Sanitize RCU usage and locking 2015-05-28 11:31:52 +09:30
build_OID_registry
bust_spinlocks.c
check_signature.c
checksum.c
clz_ctz.c
clz_tab.c
cmdline.c
compat_audit.c
cordic.c
cpu_rmap.c sched/topology: Rename topology_thread_cpumask() to topology_sibling_cpumask() 2015-05-27 15:22:15 +02:00
cpu-notifier-error-inject.c
cpumask.c revert "cpumask: don't perform while loop in cpumask_next_and()" 2015-06-18 17:00:23 -10:00
crc-ccitt.c
crc-itu-t.c lib: crc-itu-t.[ch] fix 0x0x prefix in integer constants 2015-05-26 15:26:43 +02:00
crc-t10dif.c lib: introduce crc_t10dif_update() 2015-05-30 22:42:24 -07:00
crc7.c
crc8.c
crc16.c
crc32.c
crc32defs.h
ctype.c
debug_info.c kbuild: include core debug info when DEBUG_INFO_REDUCED 2015-06-11 15:08:32 +02:00
debug_locks.c
debugobjects.c
dec_and_lock.c
decompress_bunzip2.c
decompress_inflate.c
decompress_unlz4.c
decompress_unlzma.c
decompress_unlzo.c
decompress_unxz.c
decompress.c lib/decompress: set the compressor name to NULL on error 2015-07-17 16:39:54 -07:00
devres.c cleanup IORESOURCE_CACHEABLE vs ioremap() 2015-08-10 23:07:06 -04:00
digsig.c
div64.c
dma-debug.c dma-debug: skip debug_dma_assert_idle() when disabled 2015-07-17 16:39:53 -07:00
dump_stack.c
dynamic_debug.c module: add extra argument for parse_params() callback 2015-05-20 00:25:24 -07:00
dynamic_queue_limits.c
earlycpio.c
extable.c
fault-inject.c
fdt_empty_tree.c
fdt_ro.c
fdt_rw.c
fdt_strerror.c
fdt_sw.c
fdt_wip.c
fdt.c
find_bit.c lib: rename lib/find_next_bit.c to lib/find_bit.c 2015-04-17 09:03:54 -04:00
flex_array.c
flex_proportions.c
gcd.c
gen_crc32table.c
genalloc.c genalloc: rename of_get_named_gen_pool() to of_gen_pool_get() 2015-06-30 19:45:01 -07:00
glob.c
halfmd4.c
hexdump.c hexdump: fix for non-aligned buffers 2015-07-17 16:39:53 -07:00
hweight.c
idr.c
inflate.c
int_sqrt.c
interval_tree_test.c
interval_tree.c
iomap_copy.c
iomap.c
iommu-common.c iommu-common: rename iommu_pool_hash to iommu_hash_common 2015-04-20 14:09:55 -04:00
iommu-helper.c
ioremap.c
iov_iter.c
irq_regs.c
is_single_threaded.c
jedec_ddr_data.c
kasprintf.c
Kconfig nd_blk: change aperture mapping from WC to WB 2015-08-27 19:38:28 -04:00
Kconfig.debug sched/stat: Simplify the sched_info accounting dependency 2015-07-04 10:04:30 +02:00
Kconfig.kasan x86/kasan: Move KASAN_SHADOW_OFFSET to the arch Kconfig 2015-07-06 14:53:15 +02:00
Kconfig.kgdb
Kconfig.kmemcheck
kfifo.c
klist.c
kobject_uevent.c
kobject.c include, lib: add __printf attributes to several function prototypes 2015-07-17 16:39:53 -07:00
kstrtox.c
kstrtox.h
lcm.c
libcrc32c.c
list_debug.c
list_sort.c lib/list_sort: use late_initcall to hook in self tests 2015-06-16 14:12:35 -04:00
llist.c
locking-selftest-hardirq.h
locking-selftest-mutex.h
locking-selftest-rlock-hardirq.h
locking-selftest-rlock-softirq.h
locking-selftest-rlock.h
locking-selftest-rsem.h
locking-selftest-softirq.h
locking-selftest-spin-hardirq.h
locking-selftest-spin-softirq.h
locking-selftest-spin.h
locking-selftest-wlock-hardirq.h
locking-selftest-wlock-softirq.h
locking-selftest-wlock.h
locking-selftest-wsem.h
locking-selftest.c
lockref.c
lru_cache.c lru_cache: remove use of seq_printf return value 2015-04-15 16:35:25 -07:00
Makefile Merge branch 'kbuild' of git://git.kernel.org/pub/scm/linux/kernel/git/mmarek/kbuild 2015-07-02 14:58:12 -07:00
md5.c
memory-notifier-error-inject.c
memweight.c
net_utils.c
nlattr.c
notifier-error-inject.c
notifier-error-inject.h
of-reconfig-notifier-error-inject.c
oid_registry.c
parser.c
pci_iomap.c cleanup IORESOURCE_CACHEABLE vs ioremap() 2015-08-10 23:07:06 -04:00
percpu_counter.c percpu_counter: batch size aware __percpu_counter_compare() 2015-05-29 07:39:34 +10:00
percpu_ida.c
percpu_test.c
percpu-refcount.c
plist.c
pm-notifier-error-inject.c
proportions.c
radix-tree.c radix-tree: replace preallocated node array with linked list 2015-06-25 17:00:40 -07:00
random32.c
ratelimit.c
rational.c
rbtree_test.c
rbtree.c rbtree: Make lockless searches non-fatal 2015-05-28 11:32:04 +09:30
reciprocal_div.c
rhashtable.c rhashtable: fix for resize events during table walk 2015-07-08 14:53:49 -07:00
scatterlist.c drivers/scsi/scsi_debug.c: resolve sg buffer const-ness issue 2015-06-30 19:44:59 -07:00
seq_buf.c
sha1.c
show_mem.c
smp_processor_id.c
sort.c lib/sort: Add 64 bit swap function 2015-06-25 17:00:40 -07:00
stmp_device.c
string_helpers.c SCSI misc on 20150416 2015-04-16 19:02:04 -04:00
string.c lib/string.c: introduce strreplace() 2015-06-25 17:00:40 -07:00
strncpy_from_user.c
strnlen_user.c Merge branch 'sched-core-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip 2015-06-22 15:52:04 -07:00
swiotlb.c Merge branch 'for-4.2/sg' of git://git.kernel.dk/linux-block 2015-06-25 15:22:36 -07:00
syscall.c
test_bpf.c test_bpf: add similarly conflicting jump test case only for classic 2015-05-27 14:05:59 -04:00
test_firmware.c
test_kasan.c
test_module.c
test_rhashtable.c rhashtable-test: Fix 64bit division 2015-05-05 19:30:47 -04:00
test_user_copy.c
test-hexdump.c hexdump: Make test data really const 2015-06-25 17:00:40 -07:00
test-kstrtox.c
test-string_helpers.c lib/string_helpers.c: change semantics of string_escape_mem 2015-04-15 16:35:24 -07:00
textsearch.c
timerqueue.c timerqueue: Let timerqueue_add/del return information 2015-04-22 17:06:49 +02:00
ts_bm.c
ts_fsm.c
ts_kmp.c
ucs2_string.c
usercopy.c
uuid.c
vsprintf.c lib/vsprintf.c: improve put_dec_trunc8 slightly 2015-04-17 09:03:55 -04:00