kernel_optimize_test/arch/s390/mm
Gerald Schaefer 5f490a520b s390/mm: fix dynamic pagetable upgrade for hugetlbfs
Commit ee71d16d22 ("s390/mm: make TASK_SIZE independent from the number
of page table levels") changed the logic of TASK_SIZE and also removed the
arch_mmap_check() implementation for s390. This combination has a subtle
effect on how get_unmapped_area() for hugetlbfs pages works. It is now
possible that a user process establishes a hugetlbfs mapping at an address
above 4 TB, without triggering a dynamic pagetable upgrade from 3 to 4
levels.

This is because hugetlbfs mappings do not use mm->get_unmapped_area, but
rather file->f_op->get_unmapped_area. That is currently the generic
implementation of hugetlb_get_unmapped_area(), which does not know about
s390 dynamic pagetable upgrades. With the new definition of TASK_SIZE, it
now allows mappings above 4 TB.
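
The dispatch described above can be modelled in plain userspace C. This is
only a sketch: struct mm_model, struct file_model, pick_area() and the two
callbacks are hypothetical stand-ins, not the kernel's types or functions.
It shows that only the mm-level callback ever raises the limit, so a file
that supplies its own callback bypasses the upgrade.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define LIMIT_3_LEVELS (1ULL << 42)	/* 4 TB, reachable with 3 levels */
#define LIMIT_4_LEVELS (1ULL << 53)	/* 8 PB, reachable with 4 levels */

/* Hypothetical model of the per-process state that matters here. */
struct mm_model {
	uint64_t asce_limit;	/* highest address the pagetable can map */
};

typedef uint64_t (*get_area_fn)(struct mm_model *mm, uint64_t addr,
				uint64_t len);

/* Hypothetical model of a file that may bring its own callback. */
struct file_model {
	get_area_fn get_unmapped_area;
};

/* Arch-aware path: raises the limit when the mapping needs it. */
static uint64_t arch_get_area(struct mm_model *mm, uint64_t addr,
			      uint64_t len)
{
	if (addr + len > mm->asce_limit)
		mm->asce_limit = LIMIT_4_LEVELS;
	return addr;
}

/* Generic hugetlb path: never looks at the limit. */
static uint64_t generic_hugetlb_get_area(struct mm_model *mm, uint64_t addr,
					 uint64_t len)
{
	(void)mm;
	(void)len;
	return addr;
}

/* A file-provided callback takes precedence over the mm-level one. */
static uint64_t pick_area(struct mm_model *mm, struct file_model *file,
			  uint64_t addr, uint64_t len)
{
	get_area_fn fn = arch_get_area;

	assert(mm != NULL);
	if (file && file->get_unmapped_area)
		fn = file->get_unmapped_area;
	return fn(mm, addr, len);
}
```

In this model, calling pick_area() with a file whose callback is
generic_hugetlb_get_area() and an address of 8 TB leaves asce_limit at
4 TB, while the same call with a NULL file raises it to 8 PB.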

Subsequent access to such a mapped address above 4 TB will result in a page
fault loop, because the CPU cannot translate such a large address with 3
pagetable levels. The fault handler will try to map in a hugepage at the
address, but due to the folded pagetable logic it will end up creating
entries in the 3 level pagetable, possibly overwriting existing mappings,
and then it all repeats when the access is retried.
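
The 4 TB boundary follows from the s390 translation geometry: 4 KB pages,
256-entry page tables, and 2048-entry segment and region tables. The
arithmetic can be sketched as below; asce_limit_for_levels() and
translatable() are hypothetical helper names used only for this
illustration.

```c
#include <assert.h>
#include <stdint.h>

#define PAGE_SHIFT	12	/* 4 KB pages */
#define PTE_INDEX_BITS	 8	/* 256 entries per page table */
#define CRST_INDEX_BITS	11	/* 2048 entries per segment/region table */

/*
 * Bytes of address space covered by a pagetable with the given number of
 * levels. Only 2..4 levels are modelled; 5 levels would cover the full
 * 64-bit space and overflow the shift.
 */
static uint64_t asce_limit_for_levels(int levels)
{
	int bits;

	assert(levels >= 2 && levels <= 4);
	bits = PAGE_SHIFT + PTE_INDEX_BITS + (levels - 1) * CRST_INDEX_BITS;
	return (uint64_t)1 << bits;
}

/* Non-zero if addr can be translated with this many levels. */
static int translatable(uint64_t addr, int levels)
{
	return addr < asce_limit_for_levels(levels);
}
```

With 3 levels this gives 12 + 8 + 2 * 11 = 42 address bits, i.e. 4 TB, so
an address at or above 4 TB is not translatable until the pagetable has 4
levels (53 bits, 8 PB).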

Apart from the page fault loop, this can have various nasty effects, e.g.
kernel panic from one of the BUG_ON() checks in memory management code,
or even data loss if an existing mapping gets overwritten.

Fix this by implementing HAVE_ARCH_HUGETLB_UNMAPPED_AREA support for s390,
providing an s390 version for hugetlb_get_unmapped_area() with pagetable
upgrade support similar to arch_get_unmapped_area(), which will then be
used instead of the generic version.
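
The shape of such a check can be sketched in userspace C as follows. This
is a model under stated assumptions, not the code from the patch:
struct mm_model, upgrade_model(), hugetlb_area_checked() and
TASK_SIZE_MODEL are hypothetical, and the address search that would
normally precede the check is left out.

```c
#include <assert.h>
#include <errno.h>
#include <stdint.h>

#define LIMIT_3_LEVELS	(1ULL << 42)	/* 4 TB */
#define LIMIT_4_LEVELS	(1ULL << 53)	/* 8 PB */
/* Assumption: the model caps the task size at the 4-level limit. */
#define TASK_SIZE_MODEL	LIMIT_4_LEVELS

/* Hypothetical model of the per-process pagetable state. */
struct mm_model {
	uint64_t asce_limit;	/* highest address the pagetable can map */
	int upgrade_fails;	/* simulate an allocation failure */
};

/* Add levels (11 address bits each) until `end` is covered. */
static int upgrade_model(struct mm_model *mm, uint64_t end)
{
	if (mm->upgrade_fails)
		return -ENOMEM;
	while (mm->asce_limit < end)
		mm->asce_limit <<= 11;
	return 0;
}

/*
 * Validate a candidate address before handing it back: upgrade the
 * pagetable first if the mapping would end above the current limit.
 * Returns the address, or a negative errno value on failure.
 */
static int64_t hugetlb_area_checked(struct mm_model *mm, uint64_t addr,
				    uint64_t len)
{
	uint64_t end = addr + len;
	int rc;

	assert(mm && len > 0);
	if (end > TASK_SIZE_MODEL)
		return -ENOMEM;
	if (end > mm->asce_limit) {
		rc = upgrade_model(mm, end);
		if (rc)
			return rc;
	}
	return (int64_t)addr;
}
```

The point of the ordering is that an address above the current limit is
never returned unless the upgrade has already succeeded, so a later page
fault at that address always finds enough pagetable levels.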

Fixes: ee71d16d22 ("s390/mm: make TASK_SIZE independent from the number of page table levels")
Cc: <stable@vger.kernel.org> # 4.12+
Signed-off-by: Gerald Schaefer <gerald.schaefer@de.ibm.com>
Signed-off-by: Vasily Gorbik <gor@linux.ibm.com>
2020-01-30 13:07:54 +01:00
cmm.c s390/cmm: fix information leak in cmm_timeout_handler() 2019-10-31 17:26:48 +01:00
dump_pagetables.c s390/mm: fix dump_pagetables top level page table walking 2019-08-06 13:58:34 +02:00
extmem.c s390/extmem: use refcount_t for refcount 2019-08-21 12:41:43 +02:00
fault.c s390/mm: add fallthrough annotations 2019-07-29 18:05:03 +02:00
gmap.c hmm related patches for 5.4 2019-09-21 10:07:42 -07:00
hugetlbpage.c s390/mm: fix dynamic pagetable upgrade for hugetlbfs 2020-01-30 13:07:54 +01:00
init.c mm/memory_hotplug: shrink zones when offlining memory 2020-01-04 13:55:08 -08:00
kasan_init.c s390/kasan: add KASAN_VMALLOC support 2019-12-11 19:56:59 +01:00
maccess.c s390: disable preemption when switching to nodat stack with CALL_ON_STACK 2019-11-30 10:52:45 +01:00
Makefile s390/mm: convert to the generic get_user_pages_fast code 2019-04-23 16:30:04 +02:00
mmap.c s390/mm: mmap base does not depend on ADDR_NO_RANDOMIZE personality 2019-06-04 15:03:53 +02:00
page-states.c s390/cmma: reuse kstrtobool for option value parsing 2019-08-26 12:51:18 +02:00
pageattr.c s390/mm: Clear huge page storage keys on enable_skey 2018-07-30 11:20:18 +01:00
pgalloc.c mm: treewide: clarify pgtable_page_{ctor,dtor}() naming 2019-09-26 10:10:44 -07:00
pgtable.c s390/mm: silence compiler warning when compiling without CONFIG_PGSTE 2019-04-10 17:48:28 +02:00
vmem.c s390/kernel: introduce .dma sections 2019-04-29 10:47:10 +02:00