kernel_optimize_test/mm
David Rientjes 2ff05b2b4e oom: move oom_adj value from task_struct to mm_struct
The per-task oom_adj value is a characteristic of its mm more than the
task itself since it's not possible to oom kill any thread that shares the
mm.  If a task were to be killed while attached to an mm that could not be
freed because another thread were set to OOM_DISABLE, it would have
needlessly been terminated since there is no potential for future memory
freeing.

This patch moves oomkilladj (now more appropriately named oom_adj) from
struct task_struct to struct mm_struct.  This requires task_lock() on a
task to check its oom_adj value to protect against exec, but it's already
necessary to take the lock when dereferencing the mm to find the total VM
size for the badness heuristic.

This fixes a livelock if the oom killer chooses a task and another thread
sharing the same memory has an oom_adj value of OOM_DISABLE.  This occurs
because oom_kill_task() repeatedly returns 1 and refuses to kill the
chosen task while select_bad_process() will repeatedly choose the same
task during the next retry.

Taking task_lock() in select_bad_process() to check for OOM_DISABLE and in
oom_kill_task() to check for threads sharing the same memory will be
removed in the next patch in this series where it will no longer be
necessary.

Writing to /proc/pid/oom_adj for a kthread will now return -EINVAL since
these threads are immune from oom killing already.  They simply report an
oom_adj value of OOM_DISABLE.

Cc: Nick Piggin <npiggin@suse.de>
Cc: Rik van Riel <riel@redhat.com>
Cc: Mel Gorman <mel@csn.ul.ie>
Signed-off-by: David Rientjes <rientjes@google.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2009-06-16 19:47:43 -07:00
..
allocpercpu.c percpu: __percpu_depopulate_mask can take a const mask 2009-04-06 13:44:15 -07:00
backing-dev.c block: change the request allocation/congestion logic to be sync/async based 2009-04-06 08:04:53 -07:00
bootmem.c bootmem: fix slab fallback on numa 2009-06-11 19:15:54 +03:00
bounce.c Merge branch 'for-2.6.31' of git://git.kernel.dk/linux-2.6-block 2009-06-11 11:10:35 -07:00
debug-pagealloc.c generic debug pagealloc 2009-04-01 08:59:13 -07:00
dmapool.c
fadvise.c readahead: move max_sane_readahead() calls into force_page_cache_readahead() 2009-06-16 19:47:28 -07:00
failslab.c kmemtrace, mm: fix slab.h dependency problem in mm/failslab.c 2009-04-03 12:23:01 +02:00
filemap_xip.c mm: do_xip_mapping_read: fix length calculation 2009-04-02 19:04:49 -07:00
filemap.c page allocator: do not check NUMA node ID when the caller knows the node is valid 2009-06-16 19:47:32 -07:00
fremap.c Do not account for the address space used by hugetlbfs using VM_ACCOUNT 2009-02-10 10:48:42 -08:00
highmem.c mm: introduce debug_kmap_atomic 2009-04-01 08:59:14 -07:00
hugetlb.c mm: introduce PageHuge() for testing huge/gigantic pages 2009-06-16 19:47:36 -07:00
init-mm.c mm: consolidate init_mm definition 2009-06-16 19:47:28 -07:00
internal.h mm: remove CONFIG_UNEVICTABLE_LRU config option 2009-06-16 19:47:42 -07:00
Kconfig mm: remove CONFIG_UNEVICTABLE_LRU config option 2009-06-16 19:47:42 -07:00
Kconfig.debug generic debug pagealloc: build fix 2009-04-02 19:04:48 -07:00
kmemleak-test.c kmemleak: Simple testing module for kmemleak 2009-06-11 17:04:19 +01:00
kmemleak.c kmemleak: Add the base support 2009-06-11 17:03:28 +01:00
maccess.c [S390] maccess: add weak attribute to probe_kernel_write 2009-06-12 10:27:37 +02:00
madvise.c mm: madvise(): correct return code 2009-06-16 19:47:40 -07:00
Makefile mm: consolidate init_mm definition 2009-06-16 19:47:28 -07:00
memcontrol.c vmscan: evict use-once pages first 2009-06-16 19:47:38 -07:00
memory_hotplug.c page-allocator: reset wmark_min and inactive ratio of zone when hotplug happens 2009-06-16 19:47:42 -07:00
memory.c mm: introduce follow_pfn() 2009-06-16 19:47:40 -07:00
mempolicy.c page allocator: do not check NUMA node ID when the caller knows the node is valid 2009-06-16 19:47:32 -07:00
mempool.c
migrate.c migration: only migrate_prep() once per move_pages() 2009-06-16 19:47:41 -07:00
mincore.c [CVE-2009-0029] System call wrappers part 14 2009-01-14 14:15:24 +01:00
mlock.c mm: remove CONFIG_UNEVICTABLE_LRU config option 2009-06-16 19:47:42 -07:00
mm_init.c mm: mminit_loglevel cannot be __meminitdata anymore 2008-08-20 15:40:30 -07:00
mmap.c Merge branch 'perfcounters-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip 2009-06-11 14:01:07 -07:00
mmu_notifier.c mmu-notifiers: core 2008-07-28 16:30:21 -07:00
mmzone.c [ARM] Double check memmap is actually valid with a memmap has unexpected holes V2 2009-05-18 11:22:24 +01:00
mprotect.c perf_counter: Add mmap event hooks to mprotect() 2009-06-08 23:10:43 +02:00
mremap.c [CVE-2009-0029] System call wrappers part 13 2009-01-14 14:15:23 +01:00
msync.c [CVE-2009-0029] System call wrappers part 13 2009-01-14 14:15:23 +01:00
nommu.c nommu: Provide mmap_min_addr definition. 2009-06-10 09:24:09 +10:00
oom_kill.c oom: move oom_adj value from task_struct to mm_struct 2009-06-16 19:47:43 -07:00
page_alloc.c mm: remove CONFIG_UNEVICTABLE_LRU config option 2009-06-16 19:47:42 -07:00
page_cgroup.c memcg: fix page_cgroup fatal error in FLATMEM 2009-06-12 11:00:54 +03:00
page_io.c block: fix bad definition of BIO_RW_SYNC 2009-02-18 10:32:00 +01:00
page_isolation.c memory hotplug: fix page_zone() calculation in test_pages_isolated() 2008-11-06 15:41:19 -08:00
page-writeback.c mm/page-writeback.c: dirty limit type should be unsigned long 2009-06-16 19:47:31 -07:00
pagewalk.c
pdflush.c Revert "mm: add /proc controls for pdflush threads" 2009-05-15 11:32:24 +02:00
percpu.c percpu: remove rbtree and use page->index instead 2009-04-08 18:31:31 +02:00
prio_tree.c
quicklist.c cpumask: replace node_to_cpumask with cpumask_of_node. 2009-03-13 14:49:46 +10:30
readahead.c readahead: introduce context readahead algorithm 2009-06-16 19:47:30 -07:00
rmap.c mm: remove CONFIG_UNEVICTABLE_LRU config option 2009-06-16 19:47:42 -07:00
shmem_acl.c [PATCH] sanitize ->permission() prototype 2008-07-26 20:53:14 -04:00
shmem.c mm: add swap cache interface for swap reference 2009-06-16 19:47:42 -07:00
slab.c page allocator: slab: use nr_online_nodes to check for a NUMA platform 2009-06-16 19:47:35 -07:00
slob.c page allocator: do not check NUMA node ID when the caller knows the node is valid 2009-06-16 19:47:32 -07:00
slub.c page allocator: use a pre-calculated value instead of num_online_nodes() in fast paths 2009-06-16 19:47:35 -07:00
sparse-vmemmap.c vmemmap: warn about page_structs with remote distance 2008-11-06 15:41:19 -08:00
sparse.c mm: mminit_validate_memmodel_limits(): remove redundant test 2009-04-01 08:59:11 -07:00
swap_state.c mm: modify swap_map and add SWAP_HAS_CACHE flag 2009-06-16 19:47:42 -07:00
swap.c mm: fix Committed_AS underflow on large NR_CPUS environment 2009-05-02 15:36:10 -07:00
swapfile.c mm: reuse unused swap entry if necessary 2009-06-16 19:47:42 -07:00
thrash.c
truncate.c memcg: fix deadlock between lock_page_cgroup and mapping tree_lock 2009-05-29 08:40:02 -07:00
util.c mm: clean up get_user_pages_fast() documentation 2009-06-16 19:47:30 -07:00
vmalloc.c Merge branch 'for-linus' of git://linux-arm.org/linux-2.6 2009-06-11 14:15:57 -07:00
vmscan.c mm: add swap cache interface for swap reference 2009-06-16 19:47:42 -07:00
vmstat.c mm: remove CONFIG_UNEVICTABLE_LRU config option 2009-06-16 19:47:42 -07:00