kernel_optimize_test/mm
Jesper Dangaard Brouer 994eb764ec slub bulk alloc: extract objects from the per cpu slab
First piece: acceleration of retrieval of per cpu objects

If we are allocating lots of objects then it is advantageous to disable
interrupts and avoid the this_cpu_cmpxchg() operation to get these objects
faster.

Note that we cannot do the fast operation if debugging is enabled, because
we would have to add extra code to do all the debugging checks.  And it
would not be fast anyway.

Note also that the requirement of having interrupts disabled avoids having
to do processor flag operations.

Allocate as many objects as possible in the fast way and then fall back to
the generic implementation for the rest of the objects.

Measurements on CPU CPU i7-4790K @ 4.00GHz
Baseline normal fastpath (alloc+free cost): 42 cycles(tsc) 10.554 ns

Bulk- fallback                   - this-patch
  1 -  57 cycles(tsc) 14.432 ns  -  48 cycles(tsc) 12.155 ns  improved 15.8%
  2 -  50 cycles(tsc) 12.746 ns  -  37 cycles(tsc)  9.390 ns  improved 26.0%
  3 -  48 cycles(tsc) 12.180 ns  -  33 cycles(tsc)  8.417 ns  improved 31.2%
  4 -  48 cycles(tsc) 12.015 ns  -  32 cycles(tsc)  8.045 ns  improved 33.3%
  8 -  46 cycles(tsc) 11.526 ns  -  30 cycles(tsc)  7.699 ns  improved 34.8%
 16 -  45 cycles(tsc) 11.418 ns  -  32 cycles(tsc)  8.205 ns  improved 28.9%
 30 -  80 cycles(tsc) 20.246 ns  -  73 cycles(tsc) 18.328 ns  improved  8.8%
 32 -  79 cycles(tsc) 19.946 ns  -  72 cycles(tsc) 18.208 ns  improved  8.9%
 34 -  78 cycles(tsc) 19.659 ns  -  71 cycles(tsc) 17.987 ns  improved  9.0%
 48 -  86 cycles(tsc) 21.516 ns  -  82 cycles(tsc) 20.566 ns  improved  4.7%
 64 -  93 cycles(tsc) 23.423 ns  -  89 cycles(tsc) 22.480 ns  improved  4.3%
128 - 100 cycles(tsc) 25.170 ns  -  99 cycles(tsc) 24.871 ns  improved  1.0%
158 - 102 cycles(tsc) 25.549 ns  - 101 cycles(tsc) 25.375 ns  improved  1.0%
250 - 101 cycles(tsc) 25.344 ns  - 100 cycles(tsc) 25.182 ns  improved  1.0%

Signed-off-by: Christoph Lameter <cl@linux.com>
Signed-off-by: Jesper Dangaard Brouer <brouer@redhat.com>
Cc: Pekka Enberg <penberg@kernel.org>
Cc: David Rientjes <rientjes@google.com>
Cc: Joonsoo Kim <iamjoonsoo.kim@lge.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2015-09-04 16:54:41 -07:00
..
kasan x86/kasan, mm: Introduce generic kasan_populate_zero_shadow() 2015-08-22 14:54:55 +02:00
backing-dev.c writeback: don't drain bdi_writeback_congested on bdi destruction 2015-07-02 08:46:00 -06:00
balloon_compaction.c
bootmem.c mm: page_alloc: pass PFN to __free_pages_bootmem 2015-06-30 19:44:55 -07:00
cleancache.c
cma_debug.c mm/cma_debug: correct size input to bitmap function 2015-07-17 16:39:54 -07:00
cma.c
cma.h mm: cma: mark cma_bitmap_maxno() inline in header 2015-08-14 15:56:32 -07:00
compaction.c
debug-pagealloc.c
debug.c
dmapool.c
early_ioremap.c
fadvise.c
failslab.c
filemap.c Merge branch 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs 2015-07-04 19:36:06 -07:00
frontswap.c
gup.c
highmem.c
huge_memory.c mm: check __PG_HWPOISON separately from PAGE_FLAGS_CHECK_AT_* 2015-08-07 04:39:42 +03:00
hugetlb_cgroup.c
hugetlb.c mm/hugetlb: remove unused arch hook prepare/release_hugepage 2015-06-25 17:00:35 -07:00
hwpoison-inject.c
init-mm.c
internal.h mm: meminit: finish initialisation of struct pages before basic setup 2015-06-30 19:44:56 -07:00
interval_tree.c
Kconfig mm/Kconfig: NEED_BOUNCE_POOL: clean-up condition 2015-07-23 20:59:41 +02:00
Kconfig.debug
kmemcheck.c
kmemleak-test.c
kmemleak.c
ksm.c
list_lru.c
maccess.c lib: move strncpy_from_unsafe() into mm/maccess.c 2015-08-31 12:36:10 -07:00
madvise.c
Makefile
memblock.c mm: page_alloc: pass PFN to __free_pages_bootmem 2015-06-30 19:44:55 -07:00
memcontrol.c Merge branch 'for-4.2/writeback' of git://git.kernel.dk/linux-block 2015-06-25 16:00:17 -07:00
memory_hotplug.c memory-hotplug: add hot-added memory ranges to memblock before allocate node_data for a node. 2015-09-04 16:54:41 -07:00
memory-failure.c mm/hwpoison: fix panic due to split huge zero page 2015-08-14 15:56:32 -07:00
memory.c mm: avoid setting up anonymous pages into file mapping 2015-07-09 11:12:48 -07:00
mempolicy.c
mempool.c
memtest.c
migrate.c mm/memory-failure: set PageHWPoison before migrate_pages() 2015-08-07 04:39:42 +03:00
mincore.c
mlock.c
mm_init.c mm: meminit: remove mminit_verify_page_links 2015-06-30 19:44:56 -07:00
mmap.c vfs: Commit to never having exectuables on proc and sysfs. 2015-07-10 10:39:25 -05:00
mmu_context.c
mmu_notifier.c
mmzone.c
mprotect.c
mremap.c
msync.c
nobootmem.c mm: page_alloc: pass PFN to __free_pages_bootmem 2015-06-30 19:44:55 -07:00
nommu.c Merge branch 'for-next' of git://git.kernel.org/pub/scm/linux/kernel/git/jikos/trivial 2015-09-01 18:46:42 -07:00
oom_kill.c
page_alloc.c mm: make page pfmemalloc check more robust 2015-08-21 14:30:10 -07:00
page_counter.c
page_ext.c
page_io.c fs: use helper bio_add_page() instead of open coding on bi_io_vec 2015-08-13 12:32:00 -06:00
page_isolation.c
page_owner.c mm/page_owner: set correct gfp_mask on page_owner 2015-07-17 16:39:54 -07:00
page-writeback.c writeback: fix initial dirty limit 2015-08-07 04:39:42 +03:00
pagewalk.c
percpu-km.c
percpu-vm.c
percpu.c percpu: clean up of schunk->map[] assignment in pcpu_setup_first_chunk 2015-07-21 11:31:00 -04:00
pgtable-generic.c
process_vm_access.c
quicklist.c
readahead.c
rmap.c Merge branch 'for-4.2/writeback' of git://git.kernel.dk/linux-block 2015-06-25 16:00:17 -07:00
shmem.c ipc: use private shmem or hugetlbfs inodes for shm segments. 2015-08-07 04:39:41 +03:00
slab_common.c slab: infrastructure for bulk object allocation and freeing 2015-09-04 16:54:41 -07:00
slab.c slab: infrastructure for bulk object allocation and freeing 2015-09-04 16:54:41 -07:00
slab.h slab: infrastructure for bulk object allocation and freeing 2015-09-04 16:54:41 -07:00
slob.c slab: infrastructure for bulk object allocation and freeing 2015-09-04 16:54:41 -07:00
slub.c slub bulk alloc: extract objects from the per cpu slab 2015-09-04 16:54:41 -07:00
sparse-vmemmap.c
sparse.c
swap_cgroup.c
swap_state.c
swap.c
swapfile.c
truncate.c
util.c
vmacache.c
vmalloc.c
vmpressure.c
vmscan.c mm, vmscan: Do not wait for page writeback for GFP_NOFS allocations 2015-08-05 10:49:38 +02:00
vmstat.c
workingset.c
zbud.c zpool: remove zpool_evict() 2015-06-25 17:00:37 -07:00
zpool.c zpool: remove zpool_evict() 2015-06-25 17:00:37 -07:00
zsmalloc.c zpool: remove zpool_evict() 2015-06-25 17:00:37 -07:00
zswap.c zswap: runtime enable/disable 2015-06-25 17:00:37 -07:00