kernel_optimize_test

Author	SHA1	Message	Date
AKASHI Takahiro	3711784ece	arm64: ftrace: Add CALLER_ADDRx macros CALLER_ADDRx returns caller's address at specified level in call stacks. They are used for several tracers like irqsoff and preemptoff. Strange to say, however, they are refered even without FTRACE. Signed-off-by: AKASHI Takahiro <takahiro.akashi@linaro.org> Signed-off-by: Will Deacon <will.deacon@arm.com>	2014-05-29 09:08:33 +01:00
AKASHI Takahiro	bd7d38dbdf	arm64: ftrace: Add dynamic ftrace support This patch allows "dynamic ftrace" if CONFIG_DYNAMIC_FTRACE is enabled. Here we can turn on and off tracing dynamically per-function base. On arm64, this is done by patching single branch instruction to _mcount() inserted by gcc -pg option. The branch is replaced to NOP initially at kernel start up, and later on, NOP to branch to ftrace_caller() when enabled or branch to NOP when disabled. Please note that ftrace_caller() is a counterpart of _mcount() in case of 'static' ftrace. More details on architecture specific requirements are described in Documentation/trace/ftrace-design.txt. Acked-by: Will Deacon <will.deacon@arm.com> Signed-off-by: AKASHI Takahiro <takahiro.akashi@linaro.org> Signed-off-by: Will Deacon <will.deacon@arm.com>	2014-05-29 09:08:33 +01:00
AKASHI Takahiro	819e50e25d	arm64: Add ftrace support This patch implements arm64 specific part to support function tracers, such as function (CONFIG_FUNCTION_TRACER), function_graph (CONFIG_FUNCTION_GRAPH_TRACER) and function profiler (CONFIG_FUNCTION_PROFILER). With 'function' tracer, all the functions in the kernel are traced with timestamps in ${sysfs}/tracing/trace. If function_graph tracer is specified, call graph is generated. The kernel must be compiled with -pg option so that _mcount() is inserted at the beginning of functions. This function is called on every function's entry as long as tracing is enabled. In addition, function_graph tracer also needs to be able to probe function's exit. ftrace_graph_caller() & return_to_handler do this by faking link register's value to intercept function's return path. More details on architecture specific requirements are described in Documentation/trace/ftrace-design.txt. Reviewed-by: Ganapatrao Kulkarni <ganapatrao.kulkarni@cavium.com> Acked-by: Will Deacon <will.deacon@arm.com> Signed-off-by: AKASHI Takahiro <takahiro.akashi@linaro.org> Signed-off-by: Will Deacon <will.deacon@arm.com>	2014-05-29 09:08:08 +01:00
AKASHI Takahiro	af64d2aa87	ftrace: Add arm64 support to recordmcount Recordmcount utility under scripts is run, after compiling each object, to find out all the locations of calling _mcount() and put them into specific seciton named __mcount_loc. Then linker collects all such information into a table in the kernel image (between __start_mcount_loc and __stop_mcount_loc) for later use by ftrace. This patch adds arm64 specific definitions to identify such locations. There are two types of implementation, C and Perl. On arm64, only C version is used to build the kernel now that CONFIG_HAVE_C_RECORDMCOUNT is on. But Perl version is also maintained. This patch also contains a workaround just in case where a header file, elf.h, on host machine doesn't have definitions of EM_AARCH64 nor R_AARCH64_ABS64. Without them, compiling C version of recordmcount will fail. Acked-by: Will Deacon <will.deacon@arm.com> Signed-off-by: AKASHI Takahiro <takahiro.akashi@linaro.org> Signed-off-by: Will Deacon <will.deacon@arm.com>	2014-05-29 09:04:31 +01:00
AKASHI Takahiro	26e2ae3999	arm64: Add 'notrace' attribute to unwind_frame() for ftrace walk_stackframe() calls unwind_frame(), and if walk_stackframe() is "notrace", unwind_frame() should be also "notrace". Acked-by: Will Deacon <will.deacon@arm.com> Signed-off-by: AKASHI Takahiro <takahiro.akashi@linaro.org> Signed-off-by: Will Deacon <will.deacon@arm.com>	2014-05-29 09:04:31 +01:00
AKASHI Takahiro	26e9b83a7d	arm64: add __ASSEMBLY__ in asm/insn.h Since insn.h is indirectly included in asm/entry-ftrace.S, we need to exclude some declarations by __ASSEMBLY__. Acked-by: Will Deacon <will.deacon@arm.com> Signed-off-by: AKASHI Takahiro <takahiro.akashi@linaro.org> Signed-off-by: Will Deacon <will.deacon@arm.com>	2014-05-29 09:04:31 +01:00
Will Deacon	2eb8b396dc	Merge branch 'ftrace/arm64' of git://git.kernel.org/pub/scm/linux/kernel/git/rostedt/linux-trace into for-next/core Core ftrace changes from Steve Rostedt, required by the arm64 support code.	2014-05-28 18:29:23 +01:00
Geoff Levand	af885f4022	arm64: Fix linker script entry point Change the arm64 linker script ENTRY() command to define _text as the kernel entry point. The arm64 boot protocol specifies that the kernel must be entered at the beginning of the kernel image. The existing ENTRY() command defined the symbol stext as the entry point, which emitted an incorrect entry point, but would not cause a runtime error because the existing entry code immediately jumps to stext. Signed-off-by: Geoff Levand <geoff@infradead.org> Acked-by: Will Deacon <will.deacon@arm.com> Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>	2014-05-23 15:32:03 +01:00
zhichang.yuan	0a42cb0a6f	arm64: lib: Implement optimized string length routines This patch, based on Linaro's Cortex Strings library, adds an assembly optimized strlen() and strnlen() functions. Signed-off-by: Zhichang Yuan <zhichang.yuan@linaro.org> Signed-off-by: Deepak Saxena <dsaxena@linaro.org> Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>	2014-05-23 15:17:12 +01:00
zhichang.yuan	192c4d902f	arm64: lib: Implement optimized string compare routines This patch, based on Linaro's Cortex Strings library, adds an assembly optimized strcmp() and strncmp() functions. Signed-off-by: Zhichang Yuan <zhichang.yuan@linaro.org> Signed-off-by: Deepak Saxena <dsaxena@linaro.org> Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>	2014-05-23 15:16:59 +01:00
zhichang.yuan	d875c9b372	arm64: lib: Implement optimized memcmp routine This patch, based on Linaro's Cortex Strings library, adds an assembly optimized memcmp() function. Signed-off-by: Zhichang Yuan <zhichang.yuan@linaro.org> Signed-off-by: Deepak Saxena <dsaxena@linaro.org> Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>	2014-05-23 15:07:57 +01:00
zhichang.yuan	b29a51fe0e	arm64: lib: Implement optimized memset routine This patch, based on Linaro's Cortex Strings library, improves the performance of the assembly optimized memset() function. Signed-off-by: Zhichang Yuan <zhichang.yuan@linaro.org> Signed-off-by: Deepak Saxena <dsaxena@linaro.org> Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>	2014-05-23 15:07:48 +01:00
zhichang.yuan	280adc1951	arm64: lib: Implement optimized memmove routine This patch, based on Linaro's Cortex Strings library, improves the performance of the assembly optimized memmove() function. Signed-off-by: Zhichang Yuan <zhichang.yuan@linaro.org> Signed-off-by: Deepak Saxena <dsaxena@linaro.org> Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>	2014-05-23 15:07:35 +01:00
zhichang.yuan	808dbac6b5	arm64: lib: Implement optimized memcpy routine This patch, based on Linaro's Cortex Strings library, improves the performance of the assembly optimized memcpy() function. Signed-off-by: Zhichang Yuan <zhichang.yuan@linaro.org> Signed-off-by: Deepak Saxena <dsaxena@linaro.org> Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>	2014-05-23 15:06:53 +01:00
Will Deacon	74d2eb3cdb	arm64: defconfig: enable a few more common/useful options in defconfig Whilst our defconfig is certainly usable, there are a few extra features we can enable to make it considerably more useful, particularly if people are using it for testing: - KVM - SWAP - Hugepages - ARMv8 crypto This patch enables these options in our defconfig. Note that the ordering has changed slightly, since this is the result of a new savedefconfig make target. Signed-off-by: Will Deacon <will.deacon@arm.com> Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>	2014-05-22 16:07:13 +01:00
AKASHI Takahiro	eed542d696	ftrace: Make CALLER_ADDRx macros more generic Most archs with HAVE_ARCH_CALLER_ADDR have pretty much the same definitions of CALLER_ADDRx(n). Instead of duplicating the code for all the archs, define a ftrace_return_address0() and ftrace_return_address(n) that can be overwritten by the archs if they need to do something different. Instead of 7 macros in every arch, we now only have at most 2 (and actually only 1 as ftrace_return_address0() should be the same for all archs). The CALLER_ADDRx(n) will now be defined in linux/ftrace.h and use the ftrace_return_address*(n?) macros. This removes a lot of the duplicate code. Link: http://lkml.kernel.org/p/1400585464-30333-1-git-send-email-takahiro.akashi@linaro.org Signed-off-by: AKASHI Takahiro <takahiro.akashi@linaro.org> Signed-off-by: Steven Rostedt <rostedt@goodmis.org>	2014-05-21 03:10:32 -04:00
Arun KS	b9acc49ee9	arm64: Fix deadlock scenario with smp_send_stop() If one process calls sys_reboot and that process then stops other CPUs while those CPUs are within a spin_lock() region we can potentially encounter a deadlock scenario like below. CPU 0 CPU 1 ----- ----- spin_lock(my_lock) smp_send_stop() <send IPI> handle_IPI() disable_preemption/irqs while(1); <PREEMPT> spin_lock(my_lock) <--- Waits forever We shouldn't attempt to run any other tasks after we send a stop IPI to a CPU so disable preemption so that this task runs to completion. We use local_irq_disable() here for cross-arch consistency with x86. Based-on-work-by: Stephen Boyd <sboyd@codeaurora.org> Signed-off-by: Arun KS <getarunks@gmail.com> Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>	2014-05-16 18:03:33 +01:00
Arun KS	90f51a09ef	arm64: Fix machine_shutdown() definition This patch ports most of commit `19ab428f4b` "ARM: 7759/1: decouple CPU offlining from reboot/shutdown" by Stephen Warren from arch/arm to arch/arm64. machine_shutdown() is a hook for kexec. Add a comment saying so, since it isn't obvious from the function name. Halt, power-off, and restart have different requirements re: stopping secondary CPUs than kexec has. The former simply require the secondary CPUs to be quiesced somehow, whereas kexec requires them to be completely non-operational, so that no matter where the kexec target images are written in RAM, they won't influence operation of the secondary CPUS,which could happen if the CPUs were still executing some kind of pin loop. To this end, modify machine_halt, power_off, and restart to call smp_send_stop() directly, rather than calling machine_shutdown(). In machine_shutdown(), replace the call to smp_send_stop() with a call to disable_nonboot_cpus(). This completely disables all but one CPU, thus satisfying the kexec requirements a couple paragraphs above. Signed-off-by: Arun KS <getarunks@gmail.com> Acked-by: Stephen Warren <swarren@nvidia.com> Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>	2014-05-16 18:03:18 +01:00
Larry Bassel	eb631bb5bf	arm64: Support arch_irq_work_raise() via self IPIs Support for arch_irq_work_raise() was missing from arm64 (a prerequisite for FULL_NOHZ). This patch is based on the arm32 patch ARM 7872/1. commit `bf18525fd7` Author: Stephen Boyd <sboyd@codeaurora.org> Date: Tue Oct 29 20:32:56 2013 +0100 ARM: 7872/1: Support arch_irq_work_raise() via self IPIs By default, IRQ work is run from the tick interrupt (see irq_work_run() in update_process_times()). When we're in full NOHZ mode, restarting the tick requires the use of IRQ work and if the only place we run IRQ work is in the tick interrupt we have an unbreakable cycle. Implement arch_irq_work_raise() via self IPIs to break this cycle and get the tick started again. Note that we implement this via IPIs which are only available on SMP builds. This shouldn't be a problem because full NOHZ is only supported on SMP builds anyway. Signed-off-by: Stephen Boyd <sboyd@codeaurora.org> Reviewed-by: Kevin Hilman <khilman@linaro.org> Cc: Frederic Weisbecker <fweisbec@gmail.com> Signed-off-by: Russell King <rmk+kernel@arm.linux.org.uk> Signed-off-by: Larry Bassel <larry.bassel@linaro.org> Reviewed-by: Kevin Hilman <khilman@linaro.org> Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>	2014-05-16 17:42:21 +01:00
Mark Brown	ebdc94470d	arm64: topology: Add support for topology DT bindings Add support for parsing the explicit topology bindings to discover the topology of the system. Since it is not currently clear how to map multi-level clusters for the scheduler all leaf clusters are presented to the scheduler at the same level. This should be enough to provide good support for current systems. Signed-off-by: Mark Brown <broonie@linaro.org> Reviewed-by: Lorenzo Pieralisi <lorenzo.pieralisi@arm.com> Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>	2014-05-16 17:12:41 +01:00
Mark Brown	c31bf0488d	arm64: topology: Initialise default topology state immediately As a legacy of the way 32 bit ARM did things the topology code uses a null topology map by default and then overwrites it by mapping cores with no information to a cluster by themselves later. In order to make it simpler to reset things as part of recovering from parse failures in firmware information directly set this configuration on init. A core will always be its own sibling so there should be no risk of confusion with firmware provided information. Signed-off-by: Mark Brown <broonie@linaro.org> Reviewed-by: Lorenzo Pieralisi <lorenzo.pieralisi@arm.com> Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>	2014-05-16 17:12:16 +01:00
Zi Shen Lim	5dd349bab5	arm64: sched: Remove unused mc_capable() and smt_capable() Remove unused and deprecated mc_capable() and smt_capable(). Both were added recently by `f6e763b93a` ("arm64: topology: Implement basic CPU topology support"). Uses of both were removed by `8e7fbcbc22` ("sched: Remove stale power aware scheduling remnants and dysfunctional knobs"). Signed-off-by: Zi Shen Lim <zlim@broadcom.com> Signed-off-by: Mark Brown <broonie@linaro.org> Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>	2014-05-16 17:12:04 +01:00
Catalin Marinas	5a0fdfada3	Revert "arm64: Introduce execute-only page access permissions" This reverts commit `bc07c2c6e9`. While the aim is increased security for --x memory maps, it does not protect against kernel level reads. Until SECCOMP is implemented for arm64, revert this patch to avoid giving a false idea of execute-only mappings. Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>	2014-05-16 16:44:32 +01:00
Catalin Marinas	cf5c95db57	2014-05-15 for-3.16 pull request -----BEGIN PGP SIGNATURE----- Version: GnuPG v1.4.14 (GNU/Linux) iQF8BAABCgBmBQJTdT+DXxSAAAAAAC4AKGlzc3Vlci1mcHJAbm90YXRpb25zLm9w ZW5wZ3AuZmlmdGhob3JzZW1hbi5uZXQ5Q0QyQTBEQTZBRDhGNzMzMDE3NUUyQkJD MjM3MjA3RTk1NzRGQTdEAAoJEMI3IH6VdPp9jL8H/3kzXh+5rZycb5r48E6Cic/a Gl0NmRjGtbsxLZcvd8NR3cDol1c9mEAelFrwSA3ar0W91hDf9gsEgxYSBcGKfX/b sxqzhFoArMDitvu8QQ38SMIlXaGokW3sevj4B93ljw9DqhFR/BvJctmVfyNuPLpp fFZh0JRyHnkMkXKMJKtYyiXRgfGiJ90rGHPRZLJW7zCIk0/oYd4LESpnqWC+cGKg w8WIe0Vu4CsiQffkWRmMMJmcJIVayJXAtMHOyBNCKMnYziZSqs+JIUwhbPCxci51 vTxWOFs/CAeZp8mrlHR2TD3dHtzP8c/NSL7E3k4QA0tYReA8T3XHwhRmqtHCp7E= =sllM -----END PGP SIGNATURE----- Merge tag 'for-3.16' of git://git.linaro.org/people/ard.biesheuvel/linux-arm into upstream FPSIMD register bank context switching and crypto algorithms optimisations for arm64 from Ard Biesheuvel. * tag 'for-3.16' of git://git.linaro.org/people/ard.biesheuvel/linux-arm: arm64/crypto: AES-ECB/CBC/CTR/XTS using ARMv8 NEON and Crypto Extensions arm64: pull in <asm/simd.h> from asm-generic arm64/crypto: AES in CCM mode using ARMv8 Crypto Extensions arm64/crypto: AES using ARMv8 Crypto Extensions arm64/crypto: GHASH secure hash using ARMv8 Crypto Extensions arm64/crypto: SHA-224/SHA-256 using ARMv8 Crypto Extensions arm64/crypto: SHA-1 using ARMv8 Crypto Extensions arm64: add support for kernel mode NEON in interrupt context arm64: defer reloading a task's FPSIMD state to userland resume arm64: add abstractions for FPSIMD state manipulation asm-generic: allow generic unaligned access if the arch supports it Conflicts: arch/arm64/include/asm/thread_info.h	2014-05-16 10:05:11 +01:00
Ard Biesheuvel	49788fe2a1	arm64/crypto: AES-ECB/CBC/CTR/XTS using ARMv8 NEON and Crypto Extensions This adds ARMv8 implementations of AES in ECB, CBC, CTR and XTS modes, both for ARMv8 with Crypto Extensions and for plain ARMv8 NEON. The Crypto Extensions version can only run on ARMv8 implementations that have support for these optional extensions. The plain NEON version is a table based yet time invariant implementation. All S-box substitutions are performed in parallel, leveraging the wide range of ARMv8's tbl/tbx instructions, and the huge NEON register file, which can comfortably hold the entire S-box and still have room to spare for doing the actual computations. The key expansion routines were borrowed from aes_generic. Signed-off-by: Ard Biesheuvel <ard.biesheuvel@linaro.org> Acked-by: Herbert Xu <herbert@gondor.apana.org.au>	2014-05-14 10:04:16 -07:00
Ard Biesheuvel	2ca10f892f	arm64: pull in <asm/simd.h> from asm-generic Signed-off-by: Ard Biesheuvel <ard.biesheuvel@linaro.org>	2014-05-14 10:04:16 -07:00
Ard Biesheuvel	a3fd82105b	arm64/crypto: AES in CCM mode using ARMv8 Crypto Extensions This patch adds support for the AES-CCM encryption algorithm for CPUs that have support for the AES part of the ARM v8 Crypto Extensions. Signed-off-by: Ard Biesheuvel <ard.biesheuvel@linaro.org> Acked-by: Herbert Xu <herbert@gondor.apana.org.au>	2014-05-14 10:04:15 -07:00
Ard Biesheuvel	317f2f750d	arm64/crypto: AES using ARMv8 Crypto Extensions This patch adds support for the AES symmetric encryption algorithm for CPUs that have support for the AES part of the ARM v8 Crypto Extensions. Signed-off-by: Ard Biesheuvel <ard.biesheuvel@linaro.org> Acked-by: Herbert Xu <herbert@gondor.apana.org.au>	2014-05-14 10:04:11 -07:00
Ard Biesheuvel	fdd2389457	arm64/crypto: GHASH secure hash using ARMv8 Crypto Extensions This is a port to ARMv8 (Crypto Extensions) of the Intel implementation of the GHASH Secure Hash (used in the Galois/Counter chaining mode). It relies on the optional PMULL/PMULL2 instruction (polynomial multiply long, what Intel call carry-less multiply). Signed-off-by: Ard Biesheuvel <ard.biesheuvel@linaro.org> Acked-by: Herbert Xu <herbert@gondor.apana.org.au>	2014-05-14 10:04:07 -07:00
Ard Biesheuvel	6ba6c74dfc	arm64/crypto: SHA-224/SHA-256 using ARMv8 Crypto Extensions This patch adds support for the SHA-224 and SHA-256 Secure Hash Algorithms for CPUs that have support for the SHA-2 part of the ARM v8 Crypto Extensions. Signed-off-by: Ard Biesheuvel <ard.biesheuvel@linaro.org> Acked-by: Herbert Xu <herbert@gondor.apana.org.au>	2014-05-14 10:04:01 -07:00
Ard Biesheuvel	2c98833a42	arm64/crypto: SHA-1 using ARMv8 Crypto Extensions This patch adds support for the SHA-1 Secure Hash Algorithm for CPUs that have support for the SHA-1 part of the ARM v8 Crypto Extensions. Signed-off-by: Ard Biesheuvel <ard.biesheuvel@linaro.org> Acked-by: Herbert Xu <herbert@gondor.apana.org.au>	2014-05-14 10:03:17 -07:00
AKASHI Takahiro	fd92d4a54a	arm64: is_compat_task is defined both in asm/compat.h and linux/compat.h Some kernel files may include both linux/compat.h and asm/compat.h directly or indirectly. Since both header files contain is_compat_task() under !CONFIG_COMPAT, compiling them with !CONFIG_COMPAT will eventually fail. Such files include kernel/auditsc.c, kernel/seccomp.c and init/do_mountfs.c (do_mountfs.c may read asm/compat.h via asm/ftrace.h once ftrace is implemented). So this patch proactively 1) removes is_compat_task() under !CONFIG_COMPAT from asm/compat.h 2) replaces asm/compat.h to linux/compat.h in kernel/*.c, but asm/compat.h is still necessary in ptrace.c and process.c because they use is_compat_thread(). Acked-by: Will Deacon <will.deacon@arm.com> Signed-off-by: AKASHI Takahiro <takahiro.akashi@linaro.org> Signed-off-by: Will Deacon <will.deacon@arm.com> Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>	2014-05-12 16:43:29 +01:00
AKASHI Takahiro	d34a3ebd8d	arm64: Add regs_return_value() in syscall.h This macro, regs_return_value, is used mainly for audit to record system call's results, but may also be used in test_kprobes.c. Acked-by: Will Deacon <will.deacon@arm.com> Acked-by: Richard Guy Briggs <rgb@redhat.com> Signed-off-by: AKASHI Takahiro <takahiro.akashi@linaro.org> Signed-off-by: Will Deacon <will.deacon@arm.com> Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>	2014-05-12 16:43:29 +01:00
AKASHI Takahiro	3157858fef	arm64: split syscall_trace() into separate functions for enter/exit As done in arm, this change makes it easy to confirm we invoke syscall related hooks, including syscall tracepoint, audit and seccomp which would be implemented later, in correct order. That is, undoing operations in the opposite order on exit that they were done on entry. Acked-by: Will Deacon <will.deacon@arm.com> Signed-off-by: AKASHI Takahiro <takahiro.akashi@linaro.org> Signed-off-by: Will Deacon <will.deacon@arm.com> Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>	2014-05-12 16:43:29 +01:00
AKASHI Takahiro	449f81a4da	arm64: make a single hook to syscall_trace() for all syscall features Currently syscall_trace() is called only for ptrace. With additional TIF_xx flags defined, it is now called in all the cases of audit, ftrace and seccomp in addition to ptrace. Acked-by: Richard Guy Briggs <rgb@redhat.com> Acked-by: Will Deacon <will.deacon@arm.com> Signed-off-by: AKASHI Takahiro <takahiro.akashi@linaro.org> Signed-off-by: Will Deacon <will.deacon@arm.com> Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>	2014-05-12 16:43:28 +01:00
Will Deacon	2a2830703a	arm64: debug: avoid accessing mdscr_el1 on fault paths where possible Since mdscr_el1 is part of the debug register group, it is highly likely to be trapped by a hypervisor to prevent virtual machines from debugging (buggering?) each other. Unfortunately, this absolutely destroys our performance, since we access the register on many of our low-level fault handling paths to keep track of the various debug state machines. This patch removes our dependency on mdscr_el1 in the case that debugging is not being used. More specifically we: - Use TIF_SINGLESTEP to indicate that a task is stepping at EL0 and avoid disabling step in the MDSCR when we don't need to. MDSCR_EL1.SS handling is moved to kernel_entry, when trapping from userspace. - Ensure debug exceptions are re-enabled on all exception entry paths, even the debug exception handling path (where we re-enable exceptions after invoking the handler). Since we can now rely on MDSCR_EL1.SS being cleared by the entry code, exception handlers can usually enable debug immediately before enabling interrupts. - Remove all debug exception unmasking from ret_to_user and el1_preempt, since we will never get here with debug exceptions masked. This results in a slight change to kernel debug behaviour, where we now step into interrupt handlers and data aborts from EL1 when debugging the kernel, which is actually a useful thing to do. A side-effect of this is that it does potentially prevent stepping off {break,watch}points when there is a high-frequency interrupt source (e.g. a timer), so a debugger would need to use either breakpoints or manually disable interrupts to get around this issue. With this patch applied, guest performance is restored under KVM when debug register accesses are trapped (and we get a measurable performance increase on the host on Cortex-A57 too). Cc: Ian Campbell <ian.campbell@citrix.com> Tested-by: Marc Zyngier <marc.zyngier@arm.com> Signed-off-by: Will Deacon <will.deacon@arm.com> Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>	2014-05-12 16:43:28 +01:00
Linus Torvalds	d6d211db37	Linux 3.15-rc5	2014-05-09 13:10:52 -07:00
Linus Torvalds	181da3c34a	Merge branch 'x86-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip Pull x86 fixes from Peter Anvin: "A somewhat unpleasantly large collection of small fixes. The big ones are the __visible tree sweep and a fix for 'earlyprintk=efi,keep'. It was using __init functions with predictably suboptimal results. Another key fix is a build fix which would produce output that simply would not decompress correctly in some configuration, due to the existing Makefiles picking up an unfortunate local label and mistaking it for the global symbol _end. Additional fixes include the handling of 64-bit numbers when setting the vdso data page (a latent bug which became manifest when i386 started exporting a vdso with time functions), a fix to the new MSR manipulation accessors which would cause features to not get properly unblocked, a build fix for 32-bit userland, and a few new platform quirks" * 'x86-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: x86, vdso, time: Cast tv_nsec to u64 for proper shifting in update_vsyscall() x86: Fix typo in MSR_IA32_MISC_ENABLE_LIMIT_CPUID macro x86: Fix typo preventing msr_set/clear_bit from having an effect x86/intel: Add quirk to disable HPET for the Baytrail platform x86/hpet: Make boot_hpet_disable extern x86-64, build: Fix stack protector Makefile breakage with 32-bit userland x86/reboot: Add reboot quirk for Certec BPC600 asmlinkage: Add explicit __visible to drivers/, lib/, kernel/* asmlinkage, x86: Add explicit __visible to arch/x86/* asmlinkage: Revert "lto: Make asmlinkage __visible" x86, build: Don't get confused by local symbols x86/efi: earlyprintk=efi,keep fix	2014-05-09 12:24:20 -07:00
Will Deacon	dc60b777fc	arm64: mm: use inner-shareable barriers for inner-shareable maintenance In order to ensure ordering and completion of inner-shareable maintenance instructions (cache and TLB) on AArch64, we can use the -ish suffix to the dmb and dsb instructions respectively. This patch updates our low-level cache and tlb maintenance routines to use the inner-shareable barrier variants where appropriate. Acked-by: Catalin Marinas <catalin.marinas@arm.com> Signed-off-by: Will Deacon <will.deacon@arm.com> Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>	2014-05-09 17:21:24 +01:00
Will Deacon	ee9e101c11	arm64: kvm: use inner-shareable barriers for inner-shareable maintenance In order to ensure completion of inner-shareable maintenance instructions (cache and TLB) on AArch64, we can use the -ish suffix to the dsb instruction. This patch relaxes our dsb sy instructions to dsb ish where possible. Acked-by: Catalin Marinas <catalin.marinas@arm.com> Acked-by: Marc Zyngier <marc.zyngier@arm.com> Signed-off-by: Will Deacon <will.deacon@arm.com> Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>	2014-05-09 17:04:24 +01:00
Will Deacon	d0488597a1	arm64: head: fix cache flushing and barriers in set_cpu_boot_mode_flag set_cpu_boot_mode_flag is used to identify which exception levels are encountered across the system by CPUs trying to enter the kernel. The basic algorithm is: if a CPU is booting at EL2, it will set a flag at an offset of #4 from __boot_cpu_mode, a cacheline-aligned variable. Otherwise, a flag is set at an offset of zero into the same cacheline. This enables us to check that all CPUs booted at the same exception level. This cacheline is written with the stage-1 MMU off (that is, via a strongly-ordered mapping) and will bypass any clean lines in the cache, leading to potential coherence problems when the variable is later checked via the normal, cacheable mapping of the kernel image. This patch reworks the broken flushing code so that we: (1) Use a DMB to order the strongly-ordered write of the cacheline against the subsequent cache-maintenance operation (by-VA operations only hazard against normal, cacheable accesses). (2) Use a single dc ivac instruction to invalidate any clean lines containing a stale copy of the line after it has been updated. Acked-by: Catalin Marinas <catalin.marinas@arm.com> Signed-off-by: Will Deacon <will.deacon@arm.com> Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>	2014-05-09 17:04:12 +01:00
Will Deacon	be6209a610	arm64: barriers: use barrier() instead of smp_mb() when !SMP The recently introduced acquire/release accessors refer to smp_mb() in the !CONFIG_SMP case. This is confusing when reading the code, so use barrier() directly when we know we're UP. Acked-by: Catalin Marinas <catalin.marinas@arm.com> Signed-off-by: Will Deacon <will.deacon@arm.com> Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>	2014-05-09 17:03:52 +01:00
Will Deacon	493e68747e	arm64: barriers: wire up new barrier options Now that all callers of the barrier macros are updated to pass the mandatory options, update the macros so the option is actually used. Acked-by: Catalin Marinas <catalin.marinas@arm.com> Signed-off-by: Will Deacon <will.deacon@arm.com> Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>	2014-05-09 17:03:41 +01:00
Will Deacon	98f7685ee6	arm64: barriers: make use of barrier options with explicit barriers When calling our low-level barrier macros directly, we can often suffice with more relaxed behaviour than the default "all accesses, full system" option. This patch updates the users of dsb() to specify the option which they actually require. Acked-by: Catalin Marinas <catalin.marinas@arm.com> Signed-off-by: Will Deacon <will.deacon@arm.com> Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>	2014-05-09 17:03:15 +01:00
Steve Capper	fa48e6f780	arm64: mm: Optimise tlb flush logic where we have >4K granule The tlb maintainence functions: __cpu_flush_user_tlb_range and __cpu_flush_kern_tlb_range do not take into consideration the page granule when looping through the address range, and repeatedly flush tlb entries for the same page when operating with 64K pages. This patch re-works the logic s.t. we instead advance the loop by 1 << (PAGE_SHIFT - 12), so avoid repeating ourselves. Also the routines have been converted from assembler to static inline functions to aid with legibility and potential compiler optimisations. The isb() has been removed from flush_tlb_kernel_range(.) as it is only needed when changing the execute permission of a mapping. If one needs to set an area of the kernel as execute/non-execute an isb() must be inserted after the call to flush_tlb_kernel_range. Cc: Laura Abbott <lauraa@codeaurora.org> Signed-off-by: Steve Capper <steve.capper@linaro.org> Acked-by: Will Deacon <will.deacon@arm.com> Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>	2014-05-09 17:00:48 +01:00
Will Deacon	e1dfda9ced	arm64: xchg: prevent warning if return value is unused Some users of xchg() don't bother using the return value, which results in a compiler warning like the following (from kgdb): In file included from linux/arch/arm64/include/asm/atomic.h:27:0, from include/linux/atomic.h:4, from include/linux/spinlock.h:402, from include/linux/seqlock.h:35, from include/linux/time.h:5, from include/uapi/linux/timex.h:56, from include/linux/timex.h:56, from include/linux/sched.h:19, from include/linux/pid_namespace.h:4, from kernel/debug/debug_core.c:30: kernel/debug/debug_core.c: In function ‘kgdb_cpu_enter’: linux/arch/arm64/include/asm/cmpxchg.h:75:3: warning: value computed is not used [-Wunused-value] ((__typeof__((ptr)))__xchg((unsigned long)(x),(ptr),sizeof((ptr)))) ^ linux/arch/arm64/include/asm/atomic.h:132:30: note: in expansion of macro ‘xchg’ #define atomic_xchg(v, new) (xchg(&((v)->counter), new)) kernel/debug/debug_core.c:504:4: note: in expansion of macro ‘atomic_xchg’ atomic_xchg(&kgdb_active, cpu); ^ This patch makes use of the same trick as we do for cmpxchg, by assigning the return value to a dummy variable in the xchg() macro itself. Signed-off-by: Will Deacon <will.deacon@arm.com> Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>	2014-05-09 17:00:00 +01:00
Boris Ostrovsky	28b92e09e2	x86, vdso, time: Cast tv_nsec to u64 for proper shifting in update_vsyscall() With tk->wall_to_monotonic.tv_nsec being a 32-bit value on 32-bit systems, (tk->wall_to_monotonic.tv_nsec << tk->shift) in update_vsyscall() may lose upper bits or, worse, add them since compiler will do this: (u64)(tk->wall_to_monotonic.tv_nsec << tk->shift) instead of ((u64)tk->wall_to_monotonic.tv_nsec << tk->shift) So if, for example, tv_nsec is 0x800000 and shift is 8 we will end up with 0xffffffff80000000 instead of 0x80000000. And then we are stuck in the subsequent 'while' loop. We need an explicit cast. Signed-off-by: Boris Ostrovsky <boris.ostrovsky@oracle.com> Link: http://lkml.kernel.org/r/1399648287-15178-1-git-send-email-boris.ostrovsky@oracle.com Acked-by: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com> Cc: <stable@vger.kernel.org> # v3.14 Signed-off-by: H. Peter Anvin <hpa@zytor.com>	2014-05-09 08:45:52 -07:00
Andres Freund	c45f77364b	x86: Fix typo in MSR_IA32_MISC_ENABLE_LIMIT_CPUID macro The spuriously added semicolon didn't have any effect because the macro isn't currently in use. `c0a639ad0b` Signed-off-by: Andres Freund <andres@anarazel.de> Link: http://lkml.kernel.org/r/1399598957-7011-3-git-send-email-andres@anarazel.de Cc: Borislav Petkov <bp@suse.de> Signed-off-by: H. Peter Anvin <hpa@zytor.com>	2014-05-09 08:42:47 -07:00
Andres Freund	722a0d22d0	x86: Fix typo preventing msr_set/clear_bit from having an effect Due to a typo the msr accessor function introduced in `22085a66c2` didn't have any lasting effects because they accidentally wrote the old value back. After `c0a639ad0b` this at the very least this causes cpuid limits not to be lifted on some cpus leading to missing capabilities for those. Signed-off-by: Andres Freund <andres@anarazel.de> Link: http://lkml.kernel.org/r/1399598957-7011-2-git-send-email-andres@anarazel.de Cc: Borislav Petkov <bp@suse.de> Signed-off-by: H. Peter Anvin <hpa@zytor.com>	2014-05-09 08:42:32 -07:00
Steve Capper	206a2a73a6	arm64: mm: Create gigabyte kernel logical mappings where possible We have the capability to map 1GB level 1 blocks when using a 4K granule. This patch adjusts the create_mapping logic s.t. when mapping physical memory on boot, we attempt to use a 1GB block if both the VA and PA start and end are 1GB aligned. This both reduces the levels of lookup required to resolve a kernel logical address, as well as reduces TLB pressure on cores that support 1GB TLB entries. Signed-off-by: Steve Capper <steve.capper@linaro.org> Tested-by: Jungseok Lee <jays.lee@samsung.com> [catalin.marinas@arm.com: s/prot_sect_kernel/PROT_SECT_NORMAL_EXEC/] Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>	2014-05-09 16:10:58 +01:00

1 2 3 4 5 ...

442107 Commits