kernel_optimize_test/mm
Konstantin Khlebnikov d6d86c0a7f mm/balloon_compaction: redesign ballooned pages management
Sasha Levin reported KASAN splash inside isolate_migratepages_range().
Problem is in the function __is_movable_balloon_page() which tests
AS_BALLOON_MAP in page->mapping->flags.  This function has no protection
against anonymous pages.  As result it tried to check address space flags
inside struct anon_vma.

Further investigation shows more problems in current implementation:

* Special branch in __unmap_and_move() never works:
  balloon_page_movable() checks page flags and page_count.  In
  __unmap_and_move() page is locked, reference counter is elevated, thus
  balloon_page_movable() always fails.  As a result execution goes to the
  normal migration path.  virtballoon_migratepage() returns
  MIGRATEPAGE_BALLOON_SUCCESS instead of MIGRATEPAGE_SUCCESS,
  move_to_new_page() thinks this is an error code and assigns
  newpage->mapping to NULL.  Newly migrated page lose connectivity with
  balloon an all ability for further migration.

* lru_lock erroneously required in isolate_migratepages_range() for
  isolation ballooned page.  This function releases lru_lock periodically,
  this makes migration mostly impossible for some pages.

* balloon_page_dequeue have a tight race with balloon_page_isolate:
  balloon_page_isolate could be executed in parallel with dequeue between
  picking page from list and locking page_lock.  Race is rare because they
  use trylock_page() for locking.

This patch fixes all of them.

Instead of fake mapping with special flag this patch uses special state of
page->_mapcount: PAGE_BALLOON_MAPCOUNT_VALUE = -256.  Buddy allocator uses
PAGE_BUDDY_MAPCOUNT_VALUE = -128 for similar purpose.  Storing mark
directly in struct page makes everything safer and easier.

PagePrivate is used to mark pages present in page list (i.e.  not
isolated, like PageLRU for normal pages).  It replaces special rules for
reference counter and makes balloon migration similar to migration of
normal pages.  This flag is protected by page_lock together with link to
the balloon device.

Signed-off-by: Konstantin Khlebnikov <k.khlebnikov@samsung.com>
Reported-by: Sasha Levin <sasha.levin@oracle.com>
Link: http://lkml.kernel.org/p/53E6CEAA.9020105@oracle.com
Cc: Rafael Aquini <aquini@redhat.com>
Cc: Andrey Ryabinin <ryabinin.a.a@gmail.com>
Cc: <stable@vger.kernel.org>	[3.8+]
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2014-10-09 22:26:01 -04:00
..
backing-dev.c mm: clean up zone flags 2014-10-09 22:25:57 -04:00
balloon_compaction.c mm/balloon_compaction: redesign ballooned pages management 2014-10-09 22:26:01 -04:00
bootmem.c mm/bootmem.c: use include/linux/ headers 2014-10-09 22:26:00 -04:00
cleancache.c
cma.c mm: cma: adjust address limit to avoid hitting low/high memory boundary 2014-10-09 22:25:53 -04:00
compaction.c mm/balloon_compaction: redesign ballooned pages management 2014-10-09 22:26:01 -04:00
debug-pagealloc.c
debug.c mm/debug.c: use pr_emerg() 2014-10-09 22:25:59 -04:00
dmapool.c mm/dmapool.c: fixed a brace coding style issue 2014-10-09 22:26:00 -04:00
early_ioremap.c
fadvise.c
failslab.c
filemap_xip.c
filemap.c mm/filemap.c: remove trailing whitespace 2014-10-09 22:26:00 -04:00
fremap.c
frontswap.c
gup.c mm: introduce a general RCU get_user_pages_fast() 2014-10-09 22:26:00 -04:00
highmem.c
huge_memory.c mm: use VM_BUG_ON_MM where possible 2014-10-09 22:25:58 -04:00
hugetlb_cgroup.c
hugetlb.c mm: convert a few VM_BUG_ON callers to VM_BUG_ON_VMA 2014-10-09 22:25:57 -04:00
hwpoison-inject.c
init-mm.c
internal.h mm, compaction: pass gfp mask to compact_control 2014-10-09 22:25:55 -04:00
interval_tree.c mm: convert a few VM_BUG_ON callers to VM_BUG_ON_VMA 2014-10-09 22:25:57 -04:00
iov_iter.c fuse: honour max_read and max_write in direct_io mode 2014-09-26 21:16:51 -04:00
Kconfig mm: introduce a general RCU get_user_pages_fast() 2014-10-09 22:26:00 -04:00
Kconfig.debug
kmemcheck.c mm/slab_common: move kmem_cache definition to internal header 2014-10-09 22:25:50 -04:00
kmemleak-test.c
kmemleak.c
ksm.c mm: ksm use pr_err instead of printk 2014-10-09 22:26:00 -04:00
list_lru.c
maccess.c
madvise.c
Makefile mm: move debug code out of page_alloc.c 2014-10-09 22:25:58 -04:00
memblock.c
memcontrol.c memcg: zap memcg_can_account_kmem 2014-10-09 22:26:00 -04:00
memory_hotplug.c memory-hotplug: add sysfs valid_zones attribute 2014-10-09 22:25:52 -04:00
memory-failure.c
memory.c mm: softdirty: keep bit when zapping file pte 2014-09-26 08:10:35 -07:00
mempolicy.c mempolicy: unexport get_vma_policy() and remove its "task" arg 2014-10-09 22:25:56 -04:00
mempool.c
migrate.c mm/balloon_compaction: redesign ballooned pages management 2014-10-09 22:26:01 -04:00
mincore.c
mlock.c mm: use VM_BUG_ON_MM where possible 2014-10-09 22:25:58 -04:00
mm_init.c
mmap.c mm: use VM_BUG_ON_MM where possible 2014-10-09 22:25:58 -04:00
mmu_context.c
mmu_notifier.c
mmzone.c
mprotect.c
mremap.c mm/mremap.c: use linux headers 2014-10-09 22:26:00 -04:00
msync.c
nobootmem.c
nommu.c
oom_kill.c mm: clean up zone flags 2014-10-09 22:25:57 -04:00
page_alloc.c mm: move debug code out of page_alloc.c 2014-10-09 22:25:58 -04:00
page_cgroup.c
page_io.c
page_isolation.c
page-writeback.c mm/page-writeback.c: use min3/max3 macros to avoid shadow warnings 2014-10-09 22:25:57 -04:00
pagewalk.c mm: use VM_BUG_ON_MM where possible 2014-10-09 22:25:58 -04:00
percpu-km.c
percpu-vm.c
percpu.c
pgtable-generic.c
process_vm_access.c
quicklist.c
readahead.c
rmap.c mm: convert a few VM_BUG_ON callers to VM_BUG_ON_VMA 2014-10-09 22:25:57 -04:00
shmem.c include/linux/migrate.h: remove migrate_page #define 2014-10-09 22:25:56 -04:00
slab_common.c memcg: move memcg_update_cache_size() to slab_common.c 2014-10-09 22:25:59 -04:00
slab.c mm/slab.c: use __seq_open_private() instead of seq_open() 2014-10-09 22:25:57 -04:00
slab.h mm/slab: use percpu allocator for cpu cache 2014-10-09 22:25:51 -04:00
slob.c mm/sl[ao]b: always track caller in kmalloc_(node_)track_caller() 2014-10-09 22:25:50 -04:00
slub.c mm/slab_common: commonize slab merge logic 2014-10-09 22:25:51 -04:00
sparse-vmemmap.c
sparse.c
swap_state.c mm: memcontrol: do not kill uncharge batching in free_pages_and_swap_cache 2014-10-09 22:25:59 -04:00
swap.c mm: memcontrol: do not kill uncharge batching in free_pages_and_swap_cache 2014-10-09 22:25:59 -04:00
swapfile.c
truncate.c
util.c proc/maps: make vm_is_stack() logic namespace-friendly 2014-10-09 22:25:50 -04:00
vmacache.c
vmalloc.c mm/vmalloc.c: use seq_open_private() instead of seq_open() 2014-10-09 22:25:56 -04:00
vmpressure.c
vmscan.c mm: memcontrol: fix transparent huge page allocations under pressure 2014-10-09 22:25:59 -04:00
vmstat.c
workingset.c
zbud.c
zpool.c
zsmalloc.c
zswap.c