kernel_optimize_test

Author	SHA1	Message	Date
Tony Lu	3fa17c395b	tile: support kprobes on tilegx This change includes support for Kprobes, Jprobes and Return Probes. Reviewed-by: Masami Hiramatsu <masami.hiramatsu.pt@hitachi.com> Signed-off-by: Tony Lu <zlu@tilera.com> Signed-off-by: Chris Metcalf <cmetcalf@tilera.com>	2013-08-30 11:55:53 -04:00
Chris Metcalf	9ae0983847	tile: provide traceability for hypervisor calls This change adds infrastructure (CONFIG_TILE_HVGLUE_TRACE) that provides C code wrappers for the calls the kernel makes to the Tilera hypervisor. This allows standard kernel infrastructure like FTRACE to be able to instrument hypervisor calls. To allow direct calls to the true API, we export their names with a leading underscore as well. This is important for the few contexts where we need to make hypervisor calls without touching the stack. As part of this change, we also switch from creating the symbols with linker magic to creating them with assembler magic. This lets us provide a symbol type and generally make them appear more as symbols and less as just random values in the Elf namespace. Signed-off-by: Chris Metcalf <cmetcalf@tilera.com>	2013-08-13 16:26:31 -04:00
Chris Metcalf	fad052dc4b	tile: avoid struct vm_struct leak If ioreamp_prot() fails in ioremap_page_range() due to kernel memory exhaustion, we previously would leak a struct vm_struct. Signed-off-by: Chris Metcalf <cmetcalf@tilera.com>	2013-08-13 16:26:25 -04:00
Chris Metcalf	4a556f4f56	tile: implement gettimeofday() via vDSO This change creates the framework for vDSO calls, makes the existing rt_sigreturn() mechanism use it, and adds a fast gettimeofday(). Now that we need to expose the vDSO address to userspace, we add AT_SYSINFO_EHDR to the set of aux entries provided to userspace. (You can disable any extra vDSO support by booting with vdso=0, but the rt_sigreturn vDSO page will still be provided.) Note that glibc has supported the tile vDSO since release 2.17. Signed-off-by: Chris Metcalf <cmetcalf@tilera.com>	2013-08-13 16:26:21 -04:00
Chris Metcalf	0c1d1917c5	tile: support simulator notification for ET_DYN objects The tile code notifies the simulator of new ET_EXEC objects starting to execute so that tracing code can properly annotate the objects. However, we didn't support ET_DYN executables like ld.so, so we didn't properly load symbols, etc. This change enables that support; we use a variant of the SIM_CONTROL_DLOPEN simulator notification that newer simulators will recognize and use to set the base address for the next SIM_CONTROL_OS_EXEC notification. Signed-off-by: Chris Metcalf <cmetcalf@tilera.com>	2013-08-13 16:26:17 -04:00
Chris Metcalf	bc1a298f4e	tile: support CONFIG_PREEMPT This change adds support for CONFIG_PREEMPT (full kernel preemption). In addition to the core support, this change includes a number of places where we fix up uses of smp_processor_id() and per-cpu variables. I also eliminate the PAGE_HOME_HERE and PAGE_HOME_UNKNOWN values for page homing, as it turns out they weren't being used. Signed-off-by: Chris Metcalf <cmetcalf@tilera.com>	2013-08-13 16:26:01 -04:00
Chris Metcalf	1182b69cb2	tile: remove calls to arch_flush_lazy_mmu_mode() Since it's a no-op on tile anyway, there's no reason to be calling it in tile-specific code. Signed-off-by: Chris Metcalf <cmetcalf@tilera.com>	2013-08-13 16:25:56 -04:00
Chris Metcalf	a0bd12d718	tile: fix some issues in hugepage support First, in huge_pte_offset(), we were erroneously checking pgd_present(), which is always true, rather than pud_present(), which is the thing that tells us if there is a top-level (L0) PTE. Fixing this means we properly look up huge page entries only when the Present bit is actually set in the PTE. Second, use the standard pte_alloc_map() instead of the hand-rolled pte_alloc_hugetlb() routine that basically was written to avoid worrying about CONFIG_HIGHPTE. However, we no longer plan to support HIGHPTE, so a separate routine was just unnecessary code duplication. Signed-off-by: Chris Metcalf <cmetcalf@tilera.com>	2013-08-13 16:25:52 -04:00
Chris Metcalf	2f9ac29eec	tile: fast-path unaligned memory access for tilegx This change enables unaligned userspace memory access via a kernel fast path on tilegx. The kernel tracks user PC/instruction pairs per-thread using a direct-mapped cache in userspace. The cache maps those PC/instruction pairs to JIT'ed instruction sequences that load or store using byte-wide load store intructions and then synthesize 2-, 4- or 8-byte load or store results. Once an instruction has been seen to generate an unaligned access once, subsequent hits on that instruction typically require overhead of only around 50 cycles if cache and TLB is hot. We support the prctl() PR_GET_UNALIGN / PR_SET_UNALIGN sys call to enable or disable unaligned fixups on a per-process basis. To do this we pull some of the tilepro unaligned support out of the single_step.c file; tilepro uses instruction disassembly for both single-step and unaligned access support. Since tilegx actually has hardware singlestep support, though, it's cleaner to keep the tilegx unaligned access code in a separate file. While we're at it, properly rename the tilepro-specific types, etc., to have tilepro suffixes instead of generic tile suffixes. Signed-off-by: Chris Metcalf <cmetcalf@tilera.com>	2013-08-13 16:04:10 -04:00
Chris Metcalf	e5f7bd4353	tile: fix tilegx vmalloc_sync_all BUG_ON As specified, the test wasn't correct, and in any case it should be a BUILD_BUG_ON. Signed-off-by: Chris Metcalf <cmetcalf@tilera.com>	2013-08-12 14:46:51 -04:00
Michel Lespinasse	98d1e64f95	mm: remove free_area_cache Since all architectures have been converted to use vm_unmapped_area(), there is no remaining use for the free_area_cache. Signed-off-by: Michel Lespinasse <walken@google.com> Acked-by: Rik van Riel <riel@redhat.com> Cc: "James E.J. Bottomley" <jejb@parisc-linux.org> Cc: "Luck, Tony" <tony.luck@intel.com> Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org> Cc: David Howells <dhowells@redhat.com> Cc: Helge Deller <deller@gmx.de> Cc: Ivan Kokshaysky <ink@jurassic.park.msu.ru> Cc: Matt Turner <mattst88@gmail.com> Cc: Paul Mackerras <paulus@samba.org> Cc: Richard Henderson <rth@twiddle.net> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2013-07-10 18:11:34 -07:00
Johannes Weiner	609838cfed	mm: invoke oom-killer from remaining unconverted page fault handlers A few remaining architectures directly kill the page faulting task in an out of memory situation. This is usually not a good idea since that task might not even use a significant amount of memory and so may not be the optimal victim to resolve the situation. Since 2.6.29's `1c0fe6e` ("mm: invoke oom-killer from page fault") there is a hook that architecture page fault handlers are supposed to call to invoke the OOM killer and let it pick the right task to kill. Convert the remaining architectures over to this hook. To have the previous behavior of simply taking out the faulting task the vm.oom_kill_allocating_task sysctl can be set to 1. Signed-off-by: Johannes Weiner <hannes@cmpxchg.org> Reviewed-by: Michal Hocko <mhocko@suse.cz> Cc: KAMEZAWA Hiroyuki <kamezawa.hiroyu@jp.fujitsu.com> Acked-by: David Rientjes <rientjes@google.com> Acked-by: Vineet Gupta <vgupta@synopsys.com> [arch/arc bits] Cc: James Hogan <james.hogan@imgtec.com> Cc: David Howells <dhowells@redhat.com> Cc: Jonas Bonn <jonas@southpole.se> Cc: Chen Liqin <liqin.chen@sunplusct.com> Cc: Lennox Wu <lennox.wu@gmail.com> Cc: Chris Metcalf <cmetcalf@tilera.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2013-07-09 10:33:20 -07:00
Jiang Liu	3f29c33194	mm/tile: prepare for removing num_physpages and simplify mem_init() Prepare for removing num_physpages and simplify mem_init(). Signed-off-by: Jiang Liu <jiang.liu@huawei.com> Acked-by: Chris Metcalf <cmetcalf@tilera.com> Cc: Bjorn Helgaas <bhelgaas@google.com> Cc: "David S. Miller" <davem@davemloft.net> Cc: Wen Congyang <wency@cn.fujitsu.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2013-07-03 16:07:37 -07:00
Jiang Liu	40a3b8df7b	tile: normalize global variables exported by vmlinux.lds Normalize global variables exported by vmlinux.lds to conform usage guidelines from include/asm-generic/sections.h. 1) Use _text to mark the start of the kernel image including the head text, and _stext to mark the start of the .text section. 2) Export mandatory global variables __init_begin and __init_end. Signed-off-by: Jiang Liu <jiang.liu@huawei.com> Acked-by: Chris Metcalf <cmetcalf@tilera.com> Cc: Rusty Russell <rusty@rustcorp.com.au> Cc: Bjorn Helgaas <bhelgaas@google.com> Cc: "David S. Miller" <davem@davemloft.net> Cc: Wen Congyang <wency@cn.fujitsu.com> Cc: David Howells <dhowells@redhat.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2013-07-03 16:07:34 -07:00
Jiang Liu	0c98853473	mm: concentrate modification of totalram_pages into the mm core Concentrate code to modify totalram_pages into the mm core, so the arch memory initialized code doesn't need to take care of it. With these changes applied, only following functions from mm core modify global variable totalram_pages: free_bootmem_late(), free_all_bootmem(), free_all_bootmem_node(), adjust_managed_page_count(). With this patch applied, it will be much more easier for us to keep totalram_pages and zone->managed_pages in consistence. Signed-off-by: Jiang Liu <jiang.liu@huawei.com> Acked-by: David Howells <dhowells@redhat.com> Cc: "H. Peter Anvin" <hpa@zytor.com> Cc: "Michael S. Tsirkin" <mst@redhat.com> Cc: <sworddragon2@aol.com> Cc: Arnd Bergmann <arnd@arndb.de> Cc: Catalin Marinas <catalin.marinas@arm.com> Cc: Chris Metcalf <cmetcalf@tilera.com> Cc: Geert Uytterhoeven <geert@linux-m68k.org> Cc: Ingo Molnar <mingo@redhat.com> Cc: Jeremy Fitzhardinge <jeremy@goop.org> Cc: Jianguo Wu <wujianguo@huawei.com> Cc: Joonsoo Kim <js1304@gmail.com> Cc: Kamezawa Hiroyuki <kamezawa.hiroyu@jp.fujitsu.com> Cc: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com> Cc: Marek Szyprowski <m.szyprowski@samsung.com> Cc: Mel Gorman <mel@csn.ul.ie> Cc: Michel Lespinasse <walken@google.com> Cc: Minchan Kim <minchan@kernel.org> Cc: Rik van Riel <riel@redhat.com> Cc: Rusty Russell <rusty@rustcorp.com.au> Cc: Tang Chen <tangchen@cn.fujitsu.com> Cc: Tejun Heo <tj@kernel.org> Cc: Thomas Gleixner <tglx@linutronix.de> Cc: Wen Congyang <wency@cn.fujitsu.com> Cc: Will Deacon <will.deacon@arm.com> Cc: Yasuaki Ishimatsu <isimatu.yasuaki@jp.fujitsu.com> Cc: Yinghai Lu <yinghai@kernel.org> Cc: Russell King <rmk@arm.linux.org.uk> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2013-07-03 16:07:33 -07:00
Jiang Liu	abd1b6d65f	mm/tile: use common help functions to free reserved pages Use common help functions to free reserved pages. Signed-off-by: Jiang Liu <jiang.liu@huawei.com> Cc: Chris Metcalf <cmetcalf@tilera.com> Cc: Wen Congyang <wency@cn.fujitsu.com> Cc: "H. Peter Anvin" <hpa@zytor.com> Cc: "Michael S. Tsirkin" <mst@redhat.com> Cc: <sworddragon2@aol.com> Cc: Arnd Bergmann <arnd@arndb.de> Cc: Catalin Marinas <catalin.marinas@arm.com> Cc: David Howells <dhowells@redhat.com> Cc: Geert Uytterhoeven <geert@linux-m68k.org> Cc: Ingo Molnar <mingo@redhat.com> Cc: Jeremy Fitzhardinge <jeremy@goop.org> Cc: Jianguo Wu <wujianguo@huawei.com> Cc: Joonsoo Kim <js1304@gmail.com> Cc: Kamezawa Hiroyuki <kamezawa.hiroyu@jp.fujitsu.com> Cc: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com> Cc: Marek Szyprowski <m.szyprowski@samsung.com> Cc: Mel Gorman <mel@csn.ul.ie> Cc: Michel Lespinasse <walken@google.com> Cc: Minchan Kim <minchan@kernel.org> Cc: Rik van Riel <riel@redhat.com> Cc: Rusty Russell <rusty@rustcorp.com.au> Cc: Tang Chen <tangchen@cn.fujitsu.com> Cc: Tejun Heo <tj@kernel.org> Cc: Thomas Gleixner <tglx@linutronix.de> Cc: Will Deacon <will.deacon@arm.com> Cc: Yasuaki Ishimatsu <isimatu.yasuaki@jp.fujitsu.com> Cc: Yinghai Lu <yinghai@kernel.org> Cc: Russell King <rmk@arm.linux.org.uk> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2013-07-03 16:07:33 -07:00
Joonsoo Kim	ef93247325	mm, vmalloc: change iterating a vmlist to find_vm_area() This patchset removes vm_struct list management after initializing vmalloc. Adding and removing an entry to vmlist is linear time complexity, so it is inefficient. If we maintain this list, overall time complexity of adding and removing area to vmalloc space is O(N), although we use rbtree for finding vacant place and it's time complexity is just O(logN). And vmlist and vmlist_lock is used many places of outside of vmalloc.c. It is preferable that we hide this raw data structure and provide well-defined function for supporting them, because it makes that they cannot mistake when manipulating theses structure and it makes us easily maintain vmalloc layer. For kexec and makedumpfile, I export vmap_area_list, instead of vmlist. This comes from Atsushi's recommendation. For more information, please refer below link. https://lkml.org/lkml/2012/12/6/184 This patch: The purpose of iterating a vmlist is finding vm area with specific virtual address. find_vm_area() is provided for this purpose and more efficient, because it uses a rbtree. So change it. Signed-off-by: Joonsoo Kim <js1304@gmail.com> Signed-off-by: Joonsoo Kim <iamjoonsoo.kim@lge.com> Acked-by: Guan Xuetao <gxt@mprc.pku.edu.cn> Acked-by: Ingo Molnar <mingo@kernel.org> Acked-by: Chris Metcalf <cmetcalf@tilera.com> Cc: Thomas Gleixner <tglx@linutronix.de> Cc: "H. Peter Anvin" <hpa@zytor.com> Cc: Atsushi Kumagai <kumagai-atsushi@mxc.nes.nec.co.jp> Cc: Dave Anderson <anderson@redhat.com> Cc: Eric Biederman <ebiederm@xmission.com> Cc: Vivek Goyal <vgoyal@redhat.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2013-04-29 15:54:33 -07:00
Shaohua Li	ec8acf20af	swap: add per-partition lock for swapfile swap_lock is heavily contended when I test swap to 3 fast SSD (even slightly slower than swap to 2 such SSD). The main contention comes from swap_info_get(). This patch tries to fix the gap with adding a new per-partition lock. Global data like nr_swapfiles, total_swap_pages, least_priority and swap_list are still protected by swap_lock. nr_swap_pages is an atomic now, it can be changed without swap_lock. In theory, it's possible get_swap_page() finds no swap pages but actually there are free swap pages. But sounds not a big problem. Accessing partition specific data (like scan_swap_map and so on) is only protected by swap_info_struct.lock. Changing swap_info_struct.flags need hold swap_lock and swap_info_struct.lock, because scan_scan_map() will check it. read the flags is ok with either the locks hold. If both swap_lock and swap_info_struct.lock must be hold, we always hold the former first to avoid deadlock. swap_entry_free() can change swap_list. To delete that code, we add a new highest_priority_index. Whenever get_swap_page() is called, we check it. If it's valid, we use it. It's a pity get_swap_page() still holds swap_lock(). But in practice, swap_lock() isn't heavily contended in my test with this patch (or I can say there are other much more heavier bottlenecks like TLB flush). And BTW, looks get_swap_page() doesn't really need the lock. We never free swap_info[] and we check SWAP_WRITEOK flag. The only risk without the lock is we could swapout to some low priority swap, but we can quickly recover after several rounds of swap, so sounds not a big deal to me. But I'd prefer to fix this if it's a real problem. "swap: make each swap partition have one address_space" improved the swapout speed from 1.7G/s to 2G/s. This patch further improves the speed to 2.3G/s, so around 15% improvement. It's a multi-process test, so TLB flush isn't the biggest bottleneck before the patches. [arnd@arndb.de: fix it for nommu] [hughd@google.com: add missing unlock] [minchan@kernel.org: get rid of lockdep whinge on sys_swapon] Signed-off-by: Shaohua Li <shli@fusionio.com> Cc: Hugh Dickins <hughd@google.com> Cc: Rik van Riel <riel@redhat.com> Cc: Minchan Kim <minchan.kim@gmail.com> Cc: Greg Kroah-Hartman <gregkh@linuxfoundation.org> Cc: Seth Jennings <sjenning@linux.vnet.ibm.com> Cc: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com> Cc: Xiao Guangrong <xiaoguangrong@linux.vnet.ibm.com> Cc: Dan Magenheimer <dan.magenheimer@oracle.com> Cc: Stephen Rothwell <sfr@canb.auug.org.au> Signed-off-by: Arnd Bergmann <arnd@arndb.de> Signed-off-by: Hugh Dickins <hughd@google.com> Signed-off-by: Minchan Kim <minchan@kernel.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2013-02-23 17:50:17 -08:00
Wen Congyang	24d335ca36	memory-hotplug: introduce new arch_remove_memory() for removing page table For removing memory, we need to remove page tables. But it depends on architecture. So the patch introduce arch_remove_memory() for removing page table. Now it only calls __remove_pages(). Note: __remove_pages() for some archtecuture is not implemented (I don't know how to implement it for s390). Signed-off-by: Wen Congyang <wency@cn.fujitsu.com> Signed-off-by: Tang Chen <tangchen@cn.fujitsu.com> Acked-by: KAMEZAWA Hiroyuki <kamezawa.hiroyu@jp.fujitsu.com> Cc: KOSAKI Motohiro <kosaki.motohiro@jp.fujitsu.com> Cc: Jiang Liu <jiang.liu@huawei.com> Cc: Jianguo Wu <wujianguo@huawei.com> Cc: Kamezawa Hiroyuki <kamezawa.hiroyu@jp.fujitsu.com> Cc: Lai Jiangshan <laijs@cn.fujitsu.com> Cc: Wu Jianguo <wujianguo@huawei.com> Cc: Yasuaki Ishimatsu <isimatu.yasuaki@jp.fujitsu.com> Cc: Ingo Molnar <mingo@elte.hu> Cc: Thomas Gleixner <tglx@linutronix.de> Cc: "H. Peter Anvin" <hpa@zytor.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2013-02-23 17:50:12 -08:00
Michel Lespinasse	c22c0d6344	mm: remove flags argument to mmap_region After the MAP_POPULATE handling has been moved to mmap_region() call sites, the only remaining use of the flags argument is to pass the MAP_NORESERVE flag. This can be just as easily handled by do_mmap_pgoff(), so do that and remove the mmap_region() flags parameter. [akpm@linux-foundation.org: remove double parens] Signed-off-by: Michel Lespinasse <walken@google.com> Acked-by: Rik van Riel <riel@redhat.com> Tested-by: Andy Lutomirski <luto@amacapital.net> Cc: Greg Ungerer <gregungerer@westnet.com.au> Cc: David Howells <dhowells@redhat.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2013-02-23 17:50:11 -08:00
Chris Metcalf	7c63e1ee0a	tile: export a handful of symbols appropriately This was shown up by running with "allmodconfig". I used EXPORT_SYMBOL() to match existing conventions in files that were already exporting symbols, or that were exported that way by other architectures, and otherwise EXPORT_SYMBOL_GPL(). Signed-off-by: Chris Metcalf <cmetcalf@tilera.com>	2013-02-08 13:20:36 -05:00
Linus Torvalds	9977d9b379	Merge branch 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/viro/signal Pull big execve/kernel_thread/fork unification series from Al Viro: "All architectures are converted to new model. Quite a bit of that stuff is actually shared with architecture trees; in such cases it's literally shared branch pulled by both, not a cherry-pick. A lot of ugliness and black magic is gone (-3KLoC total in this one): - kernel_thread()/kernel_execve()/sys_execve() redesign. We don't do syscalls from kernel anymore for either kernel_thread() or kernel_execve(): kernel_thread() is essentially clone(2) with callback run before we return to userland, the callbacks either never return or do successful do_execve() before returning. kernel_execve() is a wrapper for do_execve() - it doesn't need to do transition to user mode anymore. As a result kernel_thread() and kernel_execve() are arch-independent now - they live in kernel/fork.c and fs/exec.c resp. sys_execve() is also in fs/exec.c and it's completely architecture-independent. - daemonize() is gone, along with its parts in fs/.c - struct pt_regs is no longer passed to do_fork/copy_process/ copy_thread/do_execve/search_binary_handler/->load_binary/do_coredump. - sys_fork()/sys_vfork()/sys_clone() unified; some architectures still need wrappers (ones with callee-saved registers not saved in pt_regs on syscall entry), but the main part of those suckers is in kernel/fork.c now." * 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/viro/signal: (113 commits) do_coredump(): get rid of pt_regs argument print_fatal_signal(): get rid of pt_regs argument ptrace_signal(): get rid of unused arguments get rid of ptrace_signal_deliver() arguments new helper: signal_pt_regs() unify default ptrace_signal_deliver flagday: kill pt_regs argument of do_fork() death to idle_regs() don't pass regs to copy_process() flagday: don't pass regs to copy_thread() bfin: switch to generic vfork, get rid of pointless wrappers xtensa: switch to generic clone() openrisc: switch to use of generic fork and clone unicore32: switch to generic clone(2) score: switch to generic fork/vfork/clone c6x: sanitize copy_thread(), get rid of clone(2) wrapper, switch to generic clone() take sys_fork/sys_vfork/sys_clone prototypes to linux/syscalls.h mn10300: switch to generic fork/vfork/clone h8300: switch to generic fork/vfork/clone tile: switch to generic clone() ... Conflicts: arch/microblaze/include/asm/Kbuild	2012-12-12 12:22:13 -08:00
Michel Lespinasse	dd5295965b	mm: use vm_unmapped_area() in hugetlbfs on tile architecture Update the tile hugetlb_get_unmapped_area function to make use of vm_unmapped_area() instead of implementing a brute force search. [akpm@linux-foundation.org: fix build] Signed-off-by: Michel Lespinasse <walken@google.com> Reviewed-by: Rik van Riel <riel@redhat.com> Cc: Hugh Dickins <hughd@google.com> Cc: Russell King <linux@arm.linux.org.uk> Cc: Ralf Baechle <ralf@linux-mips.org> Cc: Paul Mundt <lethal@linux-sh.org> Cc: "David S. Miller" <davem@davemloft.net> Cc: Chris Metcalf <cmetcalf@tilera.com> Cc: Ingo Molnar <mingo@elte.hu> Cc: Thomas Gleixner <tglx@linutronix.de> Cc: "H. Peter Anvin" <hpa@zytor.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2012-12-11 17:22:26 -08:00
Chris Metcalf	6b14e4198c	arch/tile: eliminate pt_regs trampolines for syscalls Using the new current_pt_regs() model, we can remove some trampolines from assembly code and call directly to the C syscall implementations. rt_sigreturn() and clone() still need some assembly wrapping, but no longer are passed a pt_regs pointer. sigaltstack() and the tilepro-specific cmpxchg_badaddr() syscalls are now just straight C. Signed-off-by: Chris Metcalf <cmetcalf@tilera.com>	2012-10-23 16:23:58 -04:00
Shaohua Li	45cac65b0f	readahead: fault retry breaks mmap file read random detection .fault now can retry. The retry can break state machine of .fault. In filemap_fault, if page is miss, ra->mmap_miss is increased. In the second try, since the page is in page cache now, ra->mmap_miss is decreased. And these are done in one fault, so we can't detect random mmap file access. Add a new flag to indicate .fault is tried once. In the second try, skip ra->mmap_miss decreasing. The filemap_fault state machine is ok with it. I only tested x86, didn't test other archs, but looks the change for other archs is obvious, but who knows :) Signed-off-by: Shaohua Li <shaohua.li@fusionio.com> Cc: Rik van Riel <riel@redhat.com> Cc: Wu Fengguang <fengguang.wu@intel.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2012-10-09 16:22:47 +09:00
Konstantin Khlebnikov	2dd8ad81e3	mm: use mm->exe_file instead of first VM_EXECUTABLE vma->vm_file Some security modules and oprofile still uses VM_EXECUTABLE for retrieving a task's executable file. After this patch they will use mm->exe_file directly. mm->exe_file is protected with mm->mmap_sem, so locking stays the same. Signed-off-by: Konstantin Khlebnikov <khlebnikov@openvz.org> Acked-by: Chris Metcalf <cmetcalf@tilera.com> [arch/tile] Acked-by: Tetsuo Handa <penguin-kernel@I-love.SAKURA.ne.jp> [tomoyo] Cc: Alexander Viro <viro@zeniv.linux.org.uk> Cc: Carsten Otte <cotte@de.ibm.com> Cc: Cyrill Gorcunov <gorcunov@openvz.org> Cc: Eric Paris <eparis@redhat.com> Cc: H. Peter Anvin <hpa@zytor.com> Cc: Hugh Dickins <hughd@google.com> Cc: Ingo Molnar <mingo@redhat.com> Acked-by: James Morris <james.l.morris@oracle.com> Cc: Jason Baron <jbaron@redhat.com> Cc: Kentaro Takeda <takedakn@nttdata.co.jp> Cc: Matt Helsley <matthltc@us.ibm.com> Cc: Nick Piggin <npiggin@kernel.dk> Cc: Oleg Nesterov <oleg@redhat.com> Cc: Peter Zijlstra <a.p.zijlstra@chello.nl> Cc: Robert Richter <robert.richter@amd.com> Cc: Suresh Siddha <suresh.b.siddha@intel.com> Cc: Venkatesh Pallipadi <venki@google.com> Acked-by: Linus Torvalds <torvalds@linux-foundation.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2012-10-09 16:22:18 +09:00
Linus Torvalds	84eda28060	Merge branch 'kmap_atomic' of git://github.com/congwang/linux Pull final kmap_atomic cleanups from Cong Wang: "This should be the final round of cleanup, as the definitions of enum km_type finally get removed from the whole tree. The patches have been in linux-next for a long time." * 'kmap_atomic' of git://github.com/congwang/linux: pipe: remove KM_USER0 from comments vmalloc: remove KM_USER0 from comments feature-removal-schedule.txt: remove kmap_atomic(page, km_type) tile: remove km_type definitions um: remove km_type definitions asm-generic: remove km_type definitions avr32: remove km_type definitions frv: remove km_type definitions powerpc: remove km_type definitions arm: remove km_type definitions highmem: remove the deprecated form of kmap_atomic tile: remove usage of enum km_type frv: remove the second parameter of kmap_atomic_primary() jbd2: remove the second argument of kmap_atomic	2012-07-27 11:26:48 -07:00
Cong Wang	61d06c83fc	tile: remove usage of enum km_type Acked-by: Chris Metcalf <cmetcalf@tilera.com> Signed-off-by: Cong Wang <amwang@redhat.com>	2012-07-23 14:11:23 +08:00
Chris Metcalf	eef015c8aa	arch/tile: enable ZONE_DMA for tilegx This is required for PCI root complex legacy support and USB OHCI root complex support. With this change tilegx now supports allocating memory whose PA fits in 32 bits. Signed-off-by: Chris Metcalf <cmetcalf@tilera.com>	2012-07-18 16:40:11 -04:00
Chris Metcalf	bbaa22c3a0	tilegx pci: support I/O to arbitrarily-cached pages The tilegx PCI root complex support (currently only in linux-next) is limited to pages that are homed on cached in the default manner, i.e. "hash-for-home". This change supports delivery of I/O data to pages that are cached in other ways (locally on a particular core, uncached, user-managed incoherent, etc.). A large part of the change is supporting flushing pages from cache on particular homes so that we can transition the data that we are delivering to or from the device appropriately. The new homecache_finv* routines handle this. Some changes to page_table_range_init() were also required to make the fixmap code work correctly on tilegx; it hadn't been used there before. We also remove some stub mark_caches_evicted_*() routines that were just no-ops anyway. Signed-off-by: Chris Metcalf <cmetcalf@tilera.com>	2012-07-18 16:40:05 -04:00
Chris Metcalf	129622672d	arch/tile: tilegx PCI root complex support This change implements PCIe root complex support for tilegx using the kernel support layer for accessing the TRIO hardware shim. Reviewed-by: Bjorn Helgaas <bhelgaas@google.com> [changes in 07487f3] Signed-off-by: Chris Metcalf <cmetcalf@tilera.com>	2012-07-18 16:39:11 -04:00
Kautuk Consul	4ce6bea220	tile/mm/fault.c: Port OOM changes to handle_page_fault Commit `d065bd810b` (mm: retry page fault when blocking on disk transfer) and commit `37b23e0525` (x86,mm: make pagefault killable) The above commits introduced changes into the x86 pagefault handler for making the page fault handler retryable as well as killable. These changes reduce the mmap_sem hold time, which is crucial during OOM killer invocation. Port these changes to tile. Signed-off-by: Kautuk Consul <consul.kautuk@gmail.com> [cmetcalf@tilera.com: initialize "flags" after "write" updated.] Signed-off-by: Chris Metcalf <cmetcalf@tilera.com>	2012-05-25 12:48:29 -04:00
Chris Metcalf	621b195515	arch/tile: support multiple huge page sizes dynamically This change adds support for a new "super" bit in the PTE, using the new arch_make_huge_pte() method. The Tilera hypervisor sees the bit set at a given level of the page table and gangs together 4, 16, or 64 consecutive pages from that level of the hierarchy to create a larger TLB entry. One extra "super" page size can be specified at each of the three levels of the page table hierarchy on tilegx, using the "hugepagesz" argument on the boot command line. A new hypervisor API is added to allow Linux to tell the hypervisor how many PTEs to gang together at each level of the page table. To allow pre-allocating huge pages larger than the buddy allocator can handle, this change modifies the Tilera bootmem support to put all of memory on tilegx platforms into bootmem. As part of this change I eliminate the vestigial CONFIG_HIGHPTE support, which never worked anyway, and eliminate the hv_page_size() API in favor of the standard vma_kernel_pagesize() API. Signed-off-by: Chris Metcalf <cmetcalf@tilera.com>	2012-05-25 12:48:27 -04:00
Chris Metcalf	d5d14ed6f2	arch/tile: Allow tilegx to build with either 16K or 64K page size This change introduces new flags for the hv_install_context() API that passes a page table pointer to the hypervisor. Clients can explicitly request 4K, 16K, or 64K small pages when they install a new context. In practice, the page size is fixed at kernel compile time and the same size is always requested every time a new page table is installed. The <hv/hypervisor.h> header changes so that it provides more abstract macros for managing "page" things like PFNs and page tables. For example there is now a HV_DEFAULT_PAGE_SIZE_SMALL instead of the old HV_PAGE_SIZE_SMALL. The various PFN routines have been eliminated and only PA- or PTFN-based ones remain (since PTFNs are always expressed in fixed 2KB "page" size). The page-table management macros are renamed with a leading underscore and take page-size arguments with the presumption that clients will use those macros in some single place to provide the "real" macros they will use themselves. I happened to notice the old hv_set_caching() API was totally broken (it assumed 4KB pages) so I changed it so it would nominally work correctly with other page sizes. Tag modules with the page size so you can't load a module built with a conflicting page size. (And add a test for SMP while we're at it.) Signed-off-by: Chris Metcalf <cmetcalf@tilera.com>	2012-05-25 12:48:24 -04:00
Chris Metcalf	51007004f4	arch/tile: use interrupt critical sections less In general we want to avoid ever touching memory while within an interrupt critical section, since the page fault path goes through a different path from the hypervisor when in an interrupt critical section, and we carefully decided with tilegx that we didn't need to support this path in the kernel. (On tilepro we did implement that path as part of supporting atomic instructions in software.) In practice we always need to touch the kernel stack, since that's where we store the interrupt state before releasing the critical section, but this change cleans up a few things. The IRQ_ENABLE macro is split up so that when we want to enable interrupts in a deferred way (e.g. for cpu_idle or for interrupt return) we can read the per-cpu enable mask before entering the critical section. The cache-migration code is changed to use interrupt masking instead of interrupt critical sections. And, the interrupt-entry code is changed so that we defer loading "tp" from per-cpu data until after we have released the interrupt critical section. Signed-off-by: Chris Metcalf <cmetcalf@tilera.com>	2012-05-25 12:48:20 -04:00
Chris Metcalf	b1760c847f	arch/tile: remove bogus performance optimization We were re-homing the initial task's kernel stack on the boot cpu, but in fact it's better to let it stay globally homed, since that task isn't bound to the boot cpu anyway. This is more of a general cleanup than an actual performance optimization, but it removes code, which is a good thing. :-) Signed-off-by: Chris Metcalf <cmetcalf@tilera.com>	2012-04-02 12:13:59 -04:00
Chris Metcalf	e81510e0c3	arch/tile: export the page_home() function. This avois a bug in modules trying to use the function. Signed-off-by: Chris Metcalf <cmetcalf@tilera.com>	2012-04-02 12:13:42 -04:00
Chris Metcalf	719ea79e33	arch/tile: fix up locking in pgtable.c slightly We should be holding the init_mm.page_table_lock in shatter_huge_page() since we are modifying the kernel page tables. Then, only if we are walking the other root page tables to update them, do we want to take the pgd_lock. Add a comment about taking the pgd_lock that we always do it with interrupts disabled and therefore are not at risk from the tlbflush IPI deadlock as is seen on x86. Signed-off-by: Chris Metcalf <cmetcalf@tilera.com>	2012-04-02 12:13:22 -04:00
Chris Metcalf	7a7039ee71	arch/tile: fix bug in loading kernels larger than 16 MB Previously we only handled kernels up to a single huge page in size. Now we create additional PTEs appropriately. Signed-off-by: Chris Metcalf <cmetcalf@tilera.com>	2012-04-02 12:13:12 -04:00
Chris Metcalf	b230ff2d5c	arch/tile: don't enable irqs unconditionally in page fault handler If we took a page fault while we had interrupts disabled, we shouldn't enable them in the page fault handler. Signed-off-by: Chris Metcalf <cmetcalf@tilera.com>	2012-04-02 12:13:09 -04:00
Chris Metcalf	12400f1f22	arch/tile: don't set the homecache of a PTE unless appropriate We make sure not to try to set the home for an MMIO PTE (on tilegx) or a PTE that isn't referencing memory managed by Linux. Signed-off-by: Chris Metcalf <cmetcalf@tilera.com>	2012-04-02 12:13:05 -04:00
Chris Metcalf	48292738d0	arch/tile: don't wait for migrating PTEs in an NMI handler Doing so raises the possibility of self-deadlock if we are waiting for a backtrace for an oprofile or perf interrupt while we are in the middle of migrating our own stack page. Signed-off-by: Chris Metcalf <cmetcalf@tilera.com>	2012-04-02 12:13:02 -04:00
Chris Metcalf	51bcdf8879	arch/tile: fix a couple of comments that needed updating Not associated with any code changes, so I'm just lumping these comment changes into a commit by themselves. Signed-off-by: Chris Metcalf <cmetcalf@tilera.com>	2012-04-02 12:12:55 -04:00
Linus Torvalds	0195c00244	Disintegrate and delete asm/system.h -----BEGIN PGP SIGNATURE----- Version: GnuPG v1.4.12 (GNU/Linux) iQIVAwUAT3NKzROxKuMESys7AQKElw/+JyDxJSlj+g+nymkx8IVVuU8CsEwNLgRk 8KEnRfLhGtkXFLSJYWO6jzGo16F8Uqli1PdMFte/wagSv0285/HZaKlkkBVHdJ/m u40oSjgT013bBh6MQ0Oaf8pFezFUiQB5zPOA9QGaLVGDLXCmgqUgd7exaD5wRIwB ZmyItjZeAVnDfk1R+ZiNYytHAi8A5wSB+eFDCIQYgyulA1Igd1UnRtx+dRKbvc/m rWQ6KWbZHIdvP1ksd8wHHkrlUD2pEeJ8glJLsZUhMm/5oMf/8RmOCvmo8rvE/qwl eDQ1h4cGYlfjobxXZMHqAN9m7Jg2bI946HZjdb7/7oCeO6VW3FwPZ/Ic75p+wp45 HXJTItufERYk6QxShiOKvA+QexnYwY0IT5oRP4DrhdVB/X9cl2MoaZHC+RbYLQy+ /5VNZKi38iK4F9AbFamS7kd0i5QszA/ZzEzKZ6VMuOp3W/fagpn4ZJT1LIA3m4A9 Q0cj24mqeyCfjysu0TMbPtaN+Yjeu1o1OFRvM8XffbZsp5bNzuTDEvviJ2NXw4vK 4qUHulhYSEWcu9YgAZXvEWDEM78FXCkg2v/CrZXH5tyc95kUkMPcgG+QZBB5wElR FaOKpiC/BuNIGEf02IZQ4nfDxE90QwnDeoYeV+FvNj9UEOopJ5z5bMPoTHxm4cCD NypQthI85pc= =G9mT -----END PGP SIGNATURE----- Merge tag 'split-asm_system_h-for-linus-20120328' of git://git.kernel.org/pub/scm/linux/kernel/git/dhowells/linux-asm_system Pull "Disintegrate and delete asm/system.h" from David Howells: "Here are a bunch of patches to disintegrate asm/system.h into a set of separate bits to relieve the problem of circular inclusion dependencies. I've built all the working defconfigs from all the arches that I can and made sure that they don't break. The reason for these patches is that I recently encountered a circular dependency problem that came about when I produced some patches to optimise get_order() by rewriting it to use ilog2(). This uses bitops - and on the SH arch asm/bitops.h drags in asm-generic/get_order.h by a circuituous route involving asm/system.h. The main difficulty seems to be asm/system.h. It holds a number of low level bits with no/few dependencies that are commonly used (eg. memory barriers) and a number of bits with more dependencies that aren't used in many places (eg. switch_to()). These patches break asm/system.h up into the following core pieces: (1) asm/barrier.h Move memory barriers here. This already done for MIPS and Alpha. (2) asm/switch_to.h Move switch_to() and related stuff here. (3) asm/exec.h Move arch_align_stack() here. Other process execution related bits could perhaps go here from asm/processor.h. (4) asm/cmpxchg.h Move xchg() and cmpxchg() here as they're full word atomic ops and frequently used by atomic_xchg() and atomic_cmpxchg(). (5) asm/bug.h Move die() and related bits. (6) asm/auxvec.h Move AT_VECTOR_SIZE_ARCH here. Other arch headers are created as needed on a per-arch basis." Fixed up some conflicts from other header file cleanups and moving code around that has happened in the meantime, so David's testing is somewhat weakened by that. We'll find out anything that got broken and fix it.. * tag 'split-asm_system_h-for-linus-20120328' of git://git.kernel.org/pub/scm/linux/kernel/git/dhowells/linux-asm_system: (38 commits) Delete all instances of asm/system.h Remove all #inclusions of asm/system.h Add #includes needed to permit the removal of asm/system.h Move all declarations of free_initmem() to linux/mm.h Disintegrate asm/system.h for OpenRISC Split arch_align_stack() out from asm-generic/system.h Split the switch_to() wrapper out of asm-generic/system.h Move the asm-generic/system.h xchg() implementation to asm-generic/cmpxchg.h Create asm-generic/barrier.h Make asm-generic/cmpxchg.h #include asm-generic/cmpxchg-local.h Disintegrate asm/system.h for Xtensa Disintegrate asm/system.h for Unicore32 [based on ver #3, changed by gxt] Disintegrate asm/system.h for Tile Disintegrate asm/system.h for Sparc Disintegrate asm/system.h for SH Disintegrate asm/system.h for Score Disintegrate asm/system.h for S390 Disintegrate asm/system.h for PowerPC Disintegrate asm/system.h for PA-RISC Disintegrate asm/system.h for MN10300 ...	2012-03-28 15:58:21 -07:00
David Howells	bd119c6923	Disintegrate asm/system.h for Tile Disintegrate asm/system.h for Tile. Signed-off-by: David Howells <dhowells@redhat.com> Acked-by: Chris Metcalf <cmetcalf@tilera.com>	2012-03-28 18:30:03 +01:00
Jason Baron	909af768e8	coredump: remove VM_ALWAYSDUMP flag The motivation for this patchset was that I was looking at a way for a qemu-kvm process, to exclude the guest memory from its core dump, which can be quite large. There are already a number of filter flags in /proc/<pid>/coredump_filter, however, these allow one to specify 'types' of kernel memory, not specific address ranges (which is needed in this case). Since there are no more vma flags available, the first patch eliminates the need for the 'VM_ALWAYSDUMP' flag. The flag is used internally by the kernel to mark vdso and vsyscall pages. However, it is simple enough to check if a vma covers a vdso or vsyscall page without the need for this flag. The second patch then replaces the 'VM_ALWAYSDUMP' flag with a new 'VM_NODUMP' flag, which can be set by userspace using new madvise flags: 'MADV_DONTDUMP', and unset via 'MADV_DODUMP'. The core dump filters continue to work the same as before unless 'MADV_DONTDUMP' is set on the region. The qemu code which implements this features is at: http://people.redhat.com/~jbaron/qemu-dump/qemu-dump.patch In my testing the qemu core dump shrunk from 383MB -> 13MB with this patch. I also believe that the 'MADV_DONTDUMP' flag might be useful for security sensitive apps, which might want to select which areas are dumped. This patch: The VM_ALWAYSDUMP flag is currently used by the coredump code to indicate that a vma is part of a vsyscall or vdso section. However, we can determine if a vma is in one these sections by checking it against the gate_vma and checking for a non-NULL return value from arch_vma_name(). Thus, freeing a valuable vma bit. Signed-off-by: Jason Baron <jbaron@redhat.com> Acked-by: Roland McGrath <roland@hack.frob.com> Cc: Chris Metcalf <cmetcalf@tilera.com> Cc: Avi Kivity <avi@redhat.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2012-03-23 16:58:42 -07:00
Cong Wang	a24401bcf4	highmem: kill all __kmap_atomic() [swarren@nvidia.com: highmem: Fix ARM build break due to __kmap_atomic rename] Signed-off-by: Stephen Warren <swarren@nvidia.com> Signed-off-by: Cong Wang <amwang@redhat.com>	2012-03-20 21:48:30 +08:00
Paul E. McKenney	a95f8817f8	tile: Make tile use the new is_idle_task() API Change from direct comparison of ->pid with zero to is_idle_task(). Signed-off-by: Paul E. McKenney <paul.mckenney@linaro.org> Signed-off-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com> Reviewed-by: Josh Triplett <josh@joshtriplett.org> Acked-by: Chris Metcalf <cmetcalf@tilera.com>	2011-12-11 10:31:55 -08:00
Chris Metcalf	c2851a9b1c	arch/tile: fix double-free bug in homecache_free_pages() When freeing the page with this API, the page was "put" twice. This was only discovered bringing up an MPT fusion controller, which actually used the API; it hadn't been invoked previously, so the bug had gone unnoticed. Signed-off-by: Chris Metcalf <cmetcalf@tilera.com>	2011-12-03 15:31:47 -05:00
Julia Lawall	d1afa65ca5	arch/tile/mm/init.c: trivial: use BUG_ON Use BUG_ON(x) rather than if(x) BUG(); The semantic patch that fixes this problem is as follows: (http://coccinelle.lip6.fr/) // <smpl> @@ identifier x; @@ -if (x) BUG(); +BUG_ON(x); @@ identifier x; @@ -if (!x) BUG(); +BUG_ON(!x); // </smpl> Signed-off-by: Julia Lawall <julia@diku.dk> Signed-off-by: Chris Metcalf <cmetcalf@tilera.com>	2011-08-02 16:26:59 -04:00

1 2

79 Commits