kernel_optimize_test/mm
Dave Hansen 1de14c3c5c x86-32: Fix possible incomplete TLB invalidate with PAE pagetables
This patch attempts to fix:

	https://bugzilla.kernel.org/show_bug.cgi?id=56461

The symptom is a crash and messages like this:

	chrome: Corrupted page table at address 34a03000
	*pdpt = 0000000000000000 *pde = 0000000000000000
	Bad pagetable: 000f [#1] PREEMPT SMP

Ingo guesses this got introduced by commit 611ae8e3f5 ("x86/tlb:
enable tlb flush range support for x86") since that code started to free
unused pagetables.

On x86-32 PAE kernels, that new code has the potential to free an entire
PMD page and will clear one of the four page-directory-pointer-table
(aka pgd_t) entries.

The hardware aggressively "caches" these top-level entries and invlpg
does not actually affect the CPU's copy.  If we clear one we *HAVE* to
do a full TLB flush, otherwise we might continue using a freed pmd page.
(note, we do this properly on the population side in pud_populate()).

This patch tracks whenever we clear one of these entries in the 'struct
mmu_gather', and ensures that we follow up with a full tlb flush.
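
The tracking described above can be sketched in plain userspace C. This is
only an illustration under stated assumptions, not the kernel code: the
struct, the enum and the sketch_* helpers are hypothetical stand-ins, and
only the field names 'fullmm' and 'need_flush_all' and the final check come
from this commit message.

```c
#include <stdbool.h>

/* Hypothetical, simplified stand-in for the kernel's 'struct mmu_gather'. */
struct gather_sketch {
	bool fullmm;         /* the whole address space is being torn down */
	bool need_flush_all; /* a top-level (PDPT) entry was cleared */
};

/* Illustrative flush kinds; these are not kernel identifiers. */
enum flush_kind { FLUSH_RANGE, FLUSH_ALL };

/* Assumed hook: freeing a PMD page records that a PDPT entry went away. */
static void sketch_free_pmd(struct gather_sketch *tlb)
{
	tlb->need_flush_all = true;
}

/* Choose the flush at the end of the gather, using the check quoted below. */
static enum flush_kind sketch_pick_flush(const struct gather_sketch *tlb)
{
	if (!tlb->fullmm && !tlb->need_flush_all)
		return FLUSH_RANGE; /* a ranged invalidation is sufficient */
	return FLUSH_ALL;           /* a full TLB flush is required */
}
```

In this sketch a gather that never frees a PMD page still takes the ranged
path, while a single call to sketch_free_pmd() forces the full flush.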

BTW, I disassembled and checked that:

	if (tlb->fullmm == 0)
and
	if (!tlb->fullmm && !tlb->need_flush_all)

generate essentially the same code, so the !PAE case should see no
impact.
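
The behavioural side of that claim can be checked with a small sketch. It
assumes that 'need_flush_all' is never set on !PAE kernels and models both
flags as one-bit fields in a hypothetical struct; it says nothing about the
generated machine code, which was checked by disassembly as noted above.

```c
#include <stdbool.h>

/* Hypothetical stand-in for the two flag bits of 'struct mmu_gather'. */
struct flags_sketch {
	unsigned int fullmm : 1;
	unsigned int need_flush_all : 1;
};

/* The check as it was before this patch. */
static bool old_check(const struct flags_sketch *tlb)
{
	return tlb->fullmm == 0;
}

/* The check as it is after this patch. */
static bool new_check(const struct flags_sketch *tlb)
{
	return !tlb->fullmm && !tlb->need_flush_all;
}
```

While need_flush_all stays 0, old_check() and new_check() return the same
value for either value of fullmm; they differ only once need_flush_all is
set, which is the PAE case this patch targets.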

Signed-off-by: Dave Hansen <dave.hansen@linux.intel.com>
Cc: Peter Anvin <hpa@zytor.com>
Cc: Ingo Molnar <mingo@kernel.org>
Cc: Artem S Tashkinov <t.artem@mailcity.com>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2013-04-12 16:56:47 -07:00
backing-dev.c bdi: allow block devices to say that they require stable page writes 2013-02-21 17:22:19 -08:00
balloon_compaction.c
bootmem.c mm: Add alloc_bootmem_low_pages_nopanic() 2013-01-29 19:32:59 -08:00
bounce.c block: optionally snapshot page contents to provide stable pages during write 2013-02-21 17:22:20 -08:00
cleancache.c fs: encode_fh: return FILEID_INVALID if invalid fid_type 2013-02-26 02:46:10 -05:00
compaction.c mm: add & use zone_end_pfn() and zone_spans_pfn() 2013-02-23 17:50:20 -08:00
debug-pagealloc.c
dmapool.c
fadvise.c Merge branch 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs 2013-02-26 20:16:07 -08:00
failslab.c
filemap_xip.c
filemap.c Merge branch 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs 2013-02-26 20:16:07 -08:00
fremap.c Revert "mm: introduce VM_POPULATE flag to better deal with racy userspace programs" 2013-03-28 17:45:51 -07:00
frontswap.c
highmem.c Some nice cleanups, and even a patch my wife did as a "live" demo for 2012-12-20 08:37:05 -08:00
huge_memory.c hlist: drop the node parameter from iterators 2013-02-27 19:10:24 -08:00
hugetlb_cgroup.c mm/hugetlb: create hugetlb cgroup file in hugetlb_init 2012-12-18 15:02:15 -08:00
hugetlb.c mm/hugetlb: fix total hugetlbfs pages count when using memory overcommit accouting 2013-03-22 16:41:20 -07:00
hwpoison-inject.c
init-mm.c
internal.h mm: accelerate munlock() treatment of THP pages 2013-02-27 19:10:09 -08:00
interval_tree.c
Kconfig Select VIRT_TO_BUS directly where needed 2013-03-12 11:16:40 -07:00
Kconfig.debug
kmemcheck.c
kmemleak-test.c
kmemleak.c hlist: drop the node parameter from iterators 2013-02-27 19:10:24 -08:00
ksm.c ksm: fix m68k build: only NUMA needs pfn_to_nid 2013-03-08 15:05:34 -08:00
maccess.c
madvise.c mm: make madvise(MADV_WILLNEED) support swap file prefetch 2013-02-23 17:50:10 -08:00
Makefile
memblock.c x86, ACPI, mm: Revert movablemem_map support 2013-03-02 09:34:39 -08:00
memcontrol.c memcg: initialize kmem-cache destroying work earlier 2013-03-08 15:05:34 -08:00
memory_hotplug.c mm/hotplug: only free wait_table if it's allocated by vmalloc 2013-03-22 16:41:20 -07:00
memory-failure.c HWPOISON: change order of error_states[]'s elements 2013-02-23 17:50:22 -08:00
memory.c x86-32: Fix possible incomplete TLB invalidate with PAE pagetables 2013-04-12 16:56:47 -07:00
mempolicy.c mm/mempolicy.c: fix sp_node_init() argument ordering 2013-03-08 15:05:34 -08:00
mempool.c
migrate.c mm: remove offlining arg to migrate_pages 2013-02-23 17:50:19 -08:00
mincore.c swap: make each swap partition have one address_space 2013-02-23 17:50:17 -08:00
mlock.c Revert "mm: introduce VM_POPULATE flag to better deal with racy userspace programs" 2013-03-28 17:45:51 -07:00
mm_init.c mm: init: report on last-nid information stored in page->flags 2013-02-23 17:50:18 -08:00
mmap.c mm: prevent mmap_cache race in find_vma() 2013-04-04 11:46:28 -07:00
mmu_context.c
mmu_notifier.c hlist: drop the node parameter from iterators 2013-02-27 19:10:24 -08:00
mmzone.c mm: rename page struct field helpers 2013-02-23 17:50:18 -08:00
mprotect.c mm/mprotect.c: coding-style cleanups 2012-12-18 15:02:15 -08:00
mremap.c mm/rmap: rename anon_vma_unlock() => anon_vma_unlock_write() 2013-02-23 17:50:17 -08:00
msync.c
nobootmem.c mm: Add alloc_bootmem_low_pages_nopanic() 2013-01-29 19:32:59 -08:00
nommu.c mm: prevent mmap_cache race in find_vma() 2013-04-04 11:46:28 -07:00
oom_kill.c memcg, oom: provide more precise dump info while memcg oom happening 2013-02-23 17:50:08 -08:00
page_alloc.c x86, ACPI, mm: Revert movablemem_map support 2013-03-02 09:34:39 -08:00
page_cgroup.c
page_io.c
page_isolation.c mm: fix zone_watermark_ok_safe() accounting of isolated pages 2013-01-04 16:11:46 -08:00
page-writeback.c 2 writeback fixes 2013-02-28 13:21:44 -08:00
pagewalk.c
percpu-km.c
percpu-vm.c
percpu.c
pgtable-generic.c
process_vm_access.c Fix: compat_rw_copy_check_uvector() misuse in aio, readv, writev, and security keys 2013-03-12 11:05:45 -07:00
quicklist.c
readahead.c
rmap.c mm/rmap: rename anon_vma_unlock() => anon_vma_unlock_write() 2013-02-23 17:50:17 -08:00
shmem.c fix nommu breakage in shmem.c 2013-03-01 23:50:45 -05:00
slab_common.c slab: propagate tunable values 2012-12-18 15:02:14 -08:00
slab.c taint: add explicit flag to show whether lock dep is still OK. 2013-01-21 17:17:57 +10:30
slab.h slab: propagate tunable values 2012-12-18 15:02:14 -08:00
slob.c mm: rename page struct field helpers 2013-02-23 17:50:18 -08:00
slub.c The sweeping change is to make add_taint() explicitly indicate whether to disable 2013-02-25 15:41:43 -08:00
sparse-vmemmap.c
sparse.c memory-failure: use num_poisoned_pages instead of mce_bad_pages 2013-02-23 17:50:15 -08:00
swap_state.c swap: add per-partition lock for swapfile 2013-02-23 17:50:17 -08:00
swap.c swap: make each swap partition have one address_space 2013-02-23 17:50:17 -08:00
swapfile.c Merge branch 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs 2013-02-26 20:16:07 -08:00
truncate.c mm: drop vmtruncate 2012-12-20 18:46:29 -05:00
util.c swap: make each swap partition have one address_space 2013-02-23 17:50:17 -08:00
vmalloc.c mm: use NUMA_NO_NODE 2013-02-23 17:50:21 -08:00
vmscan.c vmscan: change type of vm_total_pages to unsigned long 2013-02-23 17:50:22 -08:00
vmstat.c mm: add & use zone_end_pfn() and zone_spans_pfn() 2013-02-23 17:50:20 -08:00