kernel_optimize_test

Michal Hocko 0a0337e0d1 mm, oom: rework oom detection

__alloc_pages_slowpath has traditionally relied on the direct reclaim and did_some_progress as an indicator that it makes sense to retry allocation rather than declaring OOM. shrink_zones had to rely on zone_reclaimable if shrink_zone didn't make any progress to prevent from a premature OOM killer invocation - the LRU might be full of dirty or writeback pages and direct reclaim cannot clean those up.

zone_reclaimable allows to rescan the reclaimable lists several times and restart if a page is freed. This is really subtle behavior and it might lead to a livelock when a single freed page keeps allocator looping but the current task will not be able to allocate that single page. OOM killer would be more appropriate than looping without any progress for unbounded amount of time.

This patch changes OOM detection logic and pulls it out from shrink_zone which is too low to be appropriate for any high level decisions such as OOM which is per zonelist property. It is __alloc_pages_slowpath which knows how many attempts have been done and what was the progress so far therefore it is more appropriate to implement this logic.

The new heuristic is implemented in should_reclaim_retry helper called from __alloc_pages_slowpath. It tries to be more deterministic and easier to follow. It builds on an assumption that retrying makes sense only if the currently reclaimable memory + free pages would allow the current allocation request to succeed (as per __zone_watermark_ok) at least for one zone in the usable zonelist.

This alone wouldn't be sufficient, though, because the writeback might get stuck and reclaimable pages might be pinned for a really long time or even depend on the current allocation context. Therefore there is a backoff mechanism implemented which reduces the reclaim target after each reclaim round without any progress. This means that we should eventually converge to only NR_FREE_PAGES as the target and fail on the wmark check and proceed to OOM. The backoff is simple and linear with 1/16 of the reclaimable pages for each round without any progress. We are optimistic and reset counter for successful reclaim rounds.

Costly high order pages mostly preserve their semantic and those without __GFP_REPEAT fail right away while those which have the flag set will back off after the amount of reclaimable pages reaches equivalent of the requested order. The only difference is that if there was no progress during the reclaim we rely on zone watermark check. This is more logical thing to do than previous 1<<order attempts which were a result of zone_reclaimable faking the progress.

[vdavydov@virtuozzo.com: check classzone_idx for shrink_zone]
[hannes@cmpxchg.org: separate the heuristic into should_reclaim_retry]
[rientjes@google.com: use zone_page_state_snapshot for NR_FREE_PAGES]
[rientjes@google.com: shrink_zones doesn't need to return anything]

Signed-off-by: Michal Hocko <mhocko@suse.com>
Acked-by: Hillf Danton <hillf.zj@alibaba-inc.com>
Cc: Vladimir Davydov <vdavydov@virtuozzo.com>
Cc: Johannes Weiner <hannes@cmpxchg.org>
Cc: David Rientjes <rientjes@google.com>
Cc: Joonsoo Kim <js1304@gmail.com>
Cc: Mel Gorman <mgorman@techsingularity.net>
Cc: Tetsuo Handa <penguin-kernel@I-love.SAKURA.ne.jp>
Cc: Vlastimil Babka <vbabka@suse.cz>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>

2016-05-20 17:58:30 -07:00
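
The retry heuristic described in the commit message above can be illustrated with a small userspace model. This is only a sketch under stated assumptions, not the kernel's should_reclaim_retry: struct zone_model, BACKOFF_STEPS, backoff_target and should_retry_model are hypothetical names introduced here, and the watermark test is reduced to a single comparison, whereas the real __zone_watermark_ok also accounts for allocation order and reserves. The only details taken from the message are the linear 1/16 backoff per round without progress and the "free + reclaimable must satisfy the watermark in at least one zone" condition.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical constant: the commit message only says the target shrinks
 * by 1/16 of the reclaimable pages per round without progress. */
#define BACKOFF_STEPS 16u

/* Hypothetical, heavily simplified stand-in for a memory zone. */
struct zone_model {
	unsigned long free_pages;        /* pages free right now */
	unsigned long reclaimable_pages; /* pages reclaim might still free */
	unsigned long watermark;         /* pages that must remain free */
};

/* Reclaimable pages still counted after `no_progress_rounds` reclaim
 * rounds that freed nothing: a linear backoff, rounded up per round. */
static unsigned long backoff_target(unsigned long reclaimable,
				    unsigned int no_progress_rounds)
{
	unsigned long cut;

	if (no_progress_rounds >= BACKOFF_STEPS)
		return 0; /* converged: only the free pages count */

	cut = (reclaimable * no_progress_rounds + BACKOFF_STEPS - 1)
	      / BACKOFF_STEPS;
	return reclaimable - cut;
}

/* Retry only if at least one zone could satisfy the request once the
 * (backed-off) reclaimable pages are added to the free pages. */
static bool should_retry_model(const struct zone_model *zones,
			       size_t nr_zones,
			       unsigned long request_pages,
			       unsigned int no_progress_rounds)
{
	size_t i;

	assert(zones != NULL || nr_zones == 0);

	for (i = 0; i < nr_zones; i++) {
		unsigned long available = zones[i].free_pages +
			backoff_target(zones[i].reclaimable_pages,
				       no_progress_rounds);

		/* Simplified watermark check (assumption, see lead-in). */
		if (available >= zones[i].watermark + request_pages)
			return true;
	}
	return false; /* no zone can help: the caller would go to OOM */
}
```

In this model, a caller would increment the no-progress counter after each reclaim round that frees nothing and reset it to zero after a successful round, which mirrors the "optimistic" reset mentioned in the message. For a single zone with 100 free pages, 1600 reclaimable pages, and a watermark of 200, a one-page request keeps retrying through 14 rounds without progress and gives up on the 15th.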
..
kasan	mm, kasan: fix compilation for CONFIG_SLAB	2016-04-01 17:03:37 -05:00
backing-dev.c	writeback: fix the wrong congested state variable definition	2016-03-31 12:26:25 -06:00
balloon_compaction.c	Merge branch 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/jikos/trivial	2016-03-17 21:38:27 -07:00
bootmem.c	mm: convert printk(KERN_<LEVEL> to pr_<level>	2016-03-17 15:09:34 -07:00
cleancache.c	cleancache: constify cleancache_ops structure	2016-01-27 09:09:57 -05:00
cma_debug.c	mm/cma_debug: correct size input to bitmap function	2015-07-17 16:39:54 -07:00
cma.c	mm/cma.c: suppress warning	2015-11-05 19:34:48 -08:00
cma.h	mm: cma: mark cma_bitmap_maxno() inline in header	2015-08-14 15:56:32 -07:00
compaction.c	mm, compaction: distinguish between full and partial COMPACT_COMPLETE	2016-05-20 17:58:30 -07:00
debug_page_ref.c	mm/page_ref: add tracepoint to track down page reference manipulation	2016-03-17 15:09:34 -07:00
debug.c	mm: introduce page reference manipulation functions	2016-03-17 15:09:34 -07:00
dmapool.c	mm: convert printk(KERN_<LEVEL> to pr_<level>	2016-03-17 15:09:34 -07:00
early_ioremap.c	mm/early_ioremap: use offset_in_page macro	2015-11-05 19:34:48 -08:00
fadvise.c	mm, fs: get rid of PAGE_CACHE_* and page_cache_{get,release} macros	2016-04-04 10:41:08 -07:00
failslab.c	mm: fault-inject take over bootstrap kmem_cache check	2016-03-15 16:55:16 -07:00
filemap.c	mm: filemap: only do access activations on reads	2016-05-20 17:58:30 -07:00
frame_vector.c	mm/gup: Switch all callers of get_user_pages() to not pass tsk/mm	2016-02-16 10:11:12 +01:00
frontswap.c	frontswap: allow multiple backends	2015-06-24 17:49:45 -07:00
gup.c	Merge branch 'mm-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip	2016-04-14 19:31:34 -07:00
highmem.c	mm/highmem: make nr_free_highpages() handles all highmem zones by itself	2016-05-19 19:12:14 -07:00
huge_memory.c	huge mm: move_huge_pmd does not need new_vma	2016-05-19 19:12:14 -07:00
hugetlb_cgroup.c	mm: make compound_head() robust	2015-11-06 17:50:42 -08:00
hugetlb.c	mm/hugetlb: add same zone check in pfn_range_valid_gigantic()	2016-05-19 19:12:14 -07:00
hwpoison-inject.c	hwpoison: use page_cgroup_ino for filtering by memcg	2015-09-10 13:29:01 -07:00
init-mm.c
internal.h	mm, compaction: distinguish between full and partial COMPACT_COMPLETE	2016-05-20 17:58:30 -07:00
interval_tree.c
Kconfig	memory_hotplug: introduce CONFIG_MEMORY_HOTPLUG_DEFAULT_ONLINE	2016-05-19 19:12:14 -07:00
Kconfig.debug	mm/page_ref: add tracepoint to track down page reference manipulation	2016-03-17 15:09:34 -07:00
kmemcheck.c	mm: convert printk(KERN_<LEVEL> to pr_<level>	2016-03-17 15:09:34 -07:00
kmemleak-test.c	mm: convert printk(KERN_<LEVEL> to pr_<level>	2016-03-17 15:09:34 -07:00
kmemleak.c	mm: coalesce split strings	2016-03-17 15:09:34 -07:00
ksm.c	ksm: fix conflict between mmput and scan_get_next_rmap_item	2016-05-12 15:52:50 -07:00
list_lru.c	mm: memcontrol: move kmem accounting code to CONFIG_MEMCG	2016-01-20 17:09:18 -08:00
maccess.c	mm/maccess.c: actually return -EFAULT from strncpy_from_unsafe	2015-11-05 19:34:48 -08:00
madvise.c	mm, fs: get rid of PAGE_CACHE_* and page_cache_{get,release} macros	2016-04-04 10:41:08 -07:00
Makefile	mm, kasan: SLAB support	2016-03-25 16:37:42 -07:00
memblock.c	mm: coalesce split strings	2016-03-17 15:09:34 -07:00
memcontrol.c	oom, oom_reaper: try to reap tasks which skip regular OOM killer path	2016-05-19 19:12:14 -07:00
memory_hotplug.c	memory_hotplug: introduce memhp_default_state= command line parameter	2016-05-19 19:12:14 -07:00
memory-failure.c	mm/memory-failure: fix race with compound page split/merge	2016-04-28 19:34:04 -07:00
memory.c	mm: thp: calculate the mapcount correctly for THP pages during WP faults	2016-05-12 15:52:50 -07:00
mempolicy.c	mm, page_alloc: avoid looking up the first zone in a zonelist twice	2016-05-19 19:12:14 -07:00
mempool.c	mm, kasan: add GFP flags to KASAN API	2016-03-25 16:37:42 -07:00
memtest.c	memtest: remove unused header files	2015-09-08 15:35:28 -07:00
migrate.c	mm: use __SetPageSwapBacked and dont ClearPageSwapBacked	2016-05-19 19:12:14 -07:00
mincore.c	mm, fs: remove remaining PAGE_CACHE_* and page_cache_{get,release} usage	2016-04-04 10:41:08 -07:00
mlock.c	mm: fix mlock accouting	2016-01-21 17:20:51 -08:00
mm_init.c	mm: convert printk(KERN_<LEVEL> to pr_<level>	2016-03-17 15:09:34 -07:00
mmap.c	mm/mmap: kill hook arch_rebalance_pgtables()	2016-05-19 19:12:14 -07:00
mmu_context.c	mm/mmu_context, sched/core: Fix mmu_context.h assumption	2016-04-28 11:44:19 +02:00
mmu_notifier.c	fix Christoph's email addresses	2016-03-17 15:09:34 -07:00
mmzone.c	mm, page_alloc: inline the fast path of the zonelist iterator	2016-05-19 19:12:14 -07:00
mprotect.c	mm/mprotect.c: don't imply PROT_EXEC on non-exec fs	2016-03-22 15:36:02 -07:00
mremap.c	huge pagecache: extend mremap pmd rmap lockout to files	2016-05-19 19:12:14 -07:00
msync.c	mm/msync: use offset_in_page macro	2015-11-05 19:34:48 -08:00
nobootmem.c	mm: convert printk(KERN_<LEVEL> to pr_<level>	2016-03-17 15:09:34 -07:00
nommu.c	Merge branch 'mm-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip	2016-04-14 19:31:34 -07:00
oom_kill.c	mm, oom_reaper: clear TIF_MEMDIE for all tasks queued for oom_reaper	2016-05-19 19:12:14 -07:00
page_alloc.c	mm, oom: rework oom detection	2016-05-20 17:58:30 -07:00
page_counter.c	mm: page_counter: let page_counter_try_charge() return bool	2015-11-05 19:34:48 -08:00
page_ext.c	mm/page_poisoning.c: allow for zero poisoning	2016-03-15 16:55:16 -07:00
page_idle.c	mm: add page_check_address_transhuge() helper	2016-01-15 17:56:32 -08:00
page_io.c	Merge branch 'work.preadv2' of git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs	2016-05-17 15:05:23 -07:00
page_isolation.c	mm/memory_hotplug: add comment to some functions related to memory hotplug	2016-05-19 19:12:14 -07:00
page_owner.c	mm, page_alloc: inline pageblock lookup in page free fast paths	2016-05-19 19:12:14 -07:00
page_poison.c	mm/page_poisoning.c: allow for zero poisoning	2016-03-15 16:55:16 -07:00
page-writeback.c	mm/writeback: correct dirty page calculation for highmem	2016-05-19 19:12:14 -07:00
pagewalk.c	thp: rename split_huge_page_pmd() to split_huge_pmd()	2016-01-15 17:56:32 -08:00
percpu-km.c	mm: percpu: use pr_fmt to prefix output	2016-03-17 15:09:34 -07:00
percpu-vm.c
percpu.c	mm: percpu: use pr_fmt to prefix output	2016-03-17 15:09:34 -07:00
pgtable-generic.c	mm/thp/migration: switch from flush_tlb_range to flush_pmd_tlb_range	2016-03-17 15:09:34 -07:00
process_vm_access.c	mm/gup: Introduce get_user_pages_remote()	2016-02-16 10:04:09 +01:00
quicklist.c	fix Christoph's email addresses	2016-03-17 15:09:34 -07:00
readahead.c	mm, fs: get rid of PAGE_CACHE_* and page_cache_{get,release} macros	2016-04-04 10:41:08 -07:00
rmap.c	mm: use __SetPageSwapBacked and dont ClearPageSwapBacked	2016-05-19 19:12:14 -07:00
shmem.c	tmpfs: mem_cgroup charge fault to vm_mm not current mm	2016-05-19 19:12:14 -07:00
slab_common.c	mm, kasan: add GFP flags to KASAN API	2016-03-25 16:37:42 -07:00
slab.c	include/linux/nodemask.h: create next_node_in() helper	2016-05-19 19:12:14 -07:00
slab.h	mm, kasan: add GFP flags to KASAN API	2016-03-25 16:37:42 -07:00
slob.c	mm: slab: free kmem_cache_node after destroy sysfs file	2016-02-18 16:23:24 -08:00
slub.c	mm: rename _count, field of the struct page, to _refcount	2016-05-19 19:12:14 -07:00
sparse-vmemmap.c	mm: convert printk(KERN_<LEVEL> to pr_<level>	2016-03-17 15:09:34 -07:00
sparse.c	mm: convert printk(KERN_<LEVEL> to pr_<level>	2016-03-17 15:09:34 -07:00
swap_cgroup.c	mm: convert printk(KERN_<LEVEL> to pr_<level>	2016-03-17 15:09:34 -07:00
swap_state.c	mm: use __SetPageSwapBacked and dont ClearPageSwapBacked	2016-05-19 19:12:14 -07:00
swap.c	thp: keep huge zero page pinned until tlb flush	2016-04-28 19:34:04 -07:00
swapfile.c	mm: thp: calculate the mapcount correctly for THP pages during WP faults	2016-05-12 15:52:50 -07:00
truncate.c	mm, fs: get rid of PAGE_CACHE_* and page_cache_{get,release} macros	2016-04-04 10:41:08 -07:00
userfaultfd.c	mm, fs: get rid of PAGE_CACHE_* and page_cache_{get,release} macros	2016-04-04 10:41:08 -07:00
util.c	mm: uninline page_mapped()	2016-05-19 19:12:14 -07:00
vmacache.c	mm/vmacache: inline vmacache_valid_mm()	2015-11-05 19:34:48 -08:00
vmalloc.c	mm/vmalloc: use PAGE_ALIGNED() to check PAGE_SIZE alignment	2016-03-17 15:09:34 -07:00
vmpressure.c	mm/vmpressure.c: fix subtree pressure detection	2016-02-03 08:28:43 -08:00
vmscan.c	mm, oom: rework oom detection	2016-05-20 17:58:30 -07:00
vmstat.c	mm, page_alloc: inline pageblock lookup in page free fast paths	2016-05-19 19:12:14 -07:00
workingset.c	mm: workingset: make shadow node shrinker memcg aware	2016-03-17 15:09:34 -07:00
zbud.c	mm/zbud.c: use list_last_entry() instead of list_tail_entry()	2016-01-15 11:40:52 -08:00
zpool.c	mm: zsmalloc: constify struct zs_pool name	2015-11-06 17:50:42 -08:00
zsmalloc.c	zsmalloc: fix zs_can_compact() integer overflow	2016-05-09 17:40:59 -07:00
zswap.c	mm/zswap: provide unique zpool name	2016-05-05 17:38:53 -07:00