kernel_optimize_test/mm
Dmitry Adamushko bdb2192851 slub: Fix use-after-preempt of per-CPU data structure
Vegard Nossum reported a crash in kmem_cache_alloc():

	BUG: unable to handle kernel paging request at da87d000
	IP: [<c01991c7>] kmem_cache_alloc+0xc7/0xe0
	*pde = 28180163 *pte = 1a87d160
	Oops: 0002 [#1] PREEMPT SMP DEBUG_PAGEALLOC
	Pid: 3850, comm: grep Not tainted (2.6.26-rc9-00059-gb190333 #5)
	EIP: 0060:[<c01991c7>] EFLAGS: 00210203 CPU: 0
	EIP is at kmem_cache_alloc+0xc7/0xe0
	EAX: 00000000 EBX: da87c100 ECX: 1adad71a EDX: 6b6b6b6b
	ESI: 00200282 EDI: da87d000 EBP: f60bfe74 ESP: f60bfe54
	DS: 007b ES: 007b FS: 00d8 GS: 0033 SS: 0068

and analyzed it:

  "The register %ecx looks innocent but is very important here. The disassembly:

       mov    %edx,%ecx
       shr    $0x2,%ecx
       rep stos %eax,%es:(%edi) <-- the fault

   So %ecx has been loaded from %edx... which is 0x6b6b6b6b/POISON_FREE.
   (0x6b6b6b6b >> 2 == 0x1adadada.)

   %ecx is the counter for the memset, from here:

       memset(object, 0, c->objsize);

  i.e. %ecx was loaded from c->objsize, so "c" must have been freed.
  Where did "c" come from? Uh-oh...

       c = get_cpu_slab(s, smp_processor_id());

  This looks like it has very much to do with CPU hotplug/unplug. Is
  there a race between SLUB/hotplug since the CPU slab is used after it
  has been freed?"

Good analysis.

Yeah, it's possible that a caller of kmem_cache_alloc() -> slab_alloc()
can be migrated on another CPU right after local_irq_restore() and
before memset().  The inital cpu can become offline in the mean time (or
a migration is a consequence of the CPU going offline) so its
'kmem_cache_cpu' structure gets freed ( slab_cpuup_callback).

At some point of time the caller continues on another CPU having an
obsolete pointer...

Signed-off-by: Dmitry Adamushko <dmitry.adamushko@gmail.com>
Reported-by: Vegard Nossum <vegard.nossum@gmail.com>
Acked-by: Ingo Molnar <mingo@elte.hu>
Cc: stable@kernel.org
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2008-07-10 15:18:50 -07:00
..
allocpercpu.c Christoph has moved 2008-07-04 10:40:04 -07:00
backing-dev.c mm: bdi: fix race in bdi_class device creation 2008-05-20 13:31:53 -07:00
bootmem.c Add return value to reserve_bootmem_node() 2008-06-21 11:25:10 -07:00
bounce.c
dmapool.c dmapool: enable debugging for CONFIG_SLUB_DEBUG_ON too 2008-04-28 08:58:20 -07:00
fadvise.c xip: support non-struct page backed memory 2008-04-28 08:58:23 -07:00
filemap_xip.c xip: support non-struct page backed memory 2008-04-28 08:58:23 -07:00
filemap.c mm: fix infinite loop in filemap_fault 2008-05-14 19:11:13 -07:00
fremap.c
highmem.c mm: highmem kernel-doc additions 2008-03-19 18:53:35 -07:00
hugetlb.c hugetlb: fix lockdep error 2008-06-06 11:29:09 -07:00
internal.h memory hotplug: free memmaps allocated by bootmem 2008-04-28 08:58:26 -07:00
Kconfig PAGEFLAGS_EXTENDED and separate page flags for Head and Tail 2008-04-28 08:58:22 -07:00
maccess.c kgdb: fix optional arch functions and probe_kernel_* 2008-04-17 20:05:39 +02:00
madvise.c xip: support non-struct page backed memory 2008-04-28 08:58:23 -07:00
Makefile uaccess: add probe_kernel_write() 2008-04-17 20:05:36 +02:00
memcontrol.c memcg: simple stats for memory resource controller 2008-05-01 08:04:02 -07:00
memory_hotplug.c memory_hotplug: always initialize pageblock bitmap 2008-05-14 19:11:15 -07:00
memory.c get_user_pages(): fix possible page leak on oom 2008-07-04 10:40:04 -07:00
mempolicy.c mempolicy: mask off internal flags for userspace API 2008-07-04 13:03:05 -07:00
mempool.c
migrate.c Christoph has moved 2008-07-04 10:40:04 -07:00
mincore.c mm: remove nopage 2008-04-28 08:58:18 -07:00
mlock.c
mmap.c brk: make sys_brk() honor COMPAT_BRK when computing lower bound 2008-06-06 11:29:09 -07:00
mmzone.c mm: filter based on a nodemask as well as a gfp_mask 2008-04-28 08:58:19 -07:00
mprotect.c mprotect: prevent alteration of the PAT bits 2008-05-14 19:11:15 -07:00
mremap.c
msync.c
nommu.c nommu: Correct kobjsize() page validity checks. 2008-06-12 07:56:17 -07:00
oom_kill.c oom_kill: remove unused parameter in badness() 2008-04-28 08:58:26 -07:00
page_alloc.c Do not overwrite nr_zones on !NUMA when initialising zlcache_ptr 2008-07-03 09:22:59 -07:00
page_io.c
page_isolation.c
page-writeback.c mm: Add NR_WRITEBACK_TEMP counter 2008-04-30 08:29:50 -07:00
pagewalk.c pagemap: pass mm into pagewalkers 2008-06-12 18:05:41 -07:00
pdflush.c mm/pdflush.c: merge the same code in two path 2008-05-13 08:02:24 -07:00
prio_tree.c
quicklist.c
readahead.c mm: bdi: export BDI attributes in sysfs 2008-04-30 08:29:49 -07:00
rmap.c mm: remove nopage 2008-04-28 08:58:18 -07:00
shmem_acl.c
shmem.c mm: bdi: add separate writeback accounting capability 2008-04-30 08:29:50 -07:00
slab.c Slab: Fix memory leak in fallback_alloc() 2008-06-21 16:51:02 -07:00
slob.c slob: Fix to return wrong pointer 2008-05-19 20:55:25 +03:00
slub.c slub: Fix use-after-preempt of per-CPU data structure 2008-07-10 15:18:50 -07:00
sparse-vmemmap.c Christoph has moved 2008-07-04 10:40:04 -07:00
sparse.c revert "memory hotplug: allocate usemap on the section with pgdat" 2008-04-30 08:29:55 -07:00
swap_state.c mm: bdi: add separate writeback accounting capability 2008-04-30 08:29:50 -07:00
swap.c mm: fix atomic_t overflow in vm 2008-05-24 09:56:09 -07:00
swapfile.c mm: use non-racy method for /proc/swaps creation 2008-04-29 08:06:20 -07:00
thrash.c
tiny-shmem.c Merge branch 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs-2.6 2008-03-25 08:57:47 -07:00
truncate.c fix invalidate_inode_pages2_range() to not clear ret 2008-04-28 08:58:18 -07:00
util.c
vmalloc.c docbook: fix vmalloc missing parameter notation 2008-05-01 08:03:59 -07:00
vmscan.c mm: fix incorrect variable type in do_try_to_free_pages() 2008-06-12 18:05:39 -07:00
vmstat.c make vmstat cpu-unplug safe 2008-05-13 08:02:23 -07:00