The pages allocated by the cmm memory balloon should be freed before
the hibernation image is created. Otherwise the memory reserved by the
balloon gets written to the swap device but there is no content in
these pages that need to be preserved.
Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
Git commit ea43546750 changed the
definition of atomic_t and atomic64_t for s390 by adding the volatile
modifier to the counter field. This has an unfortunate side effect
with newer versions of the gcc. The typeof operator now picks up the
volatile modifier from the expression. This causes the compiler to
think that it has to store the two temporary variable old_val and
new_val in the __CS_LOOP for the different atomic operations to the
stack as the variables are now volatile. Both stores are superfluous.
The hack to replace typeof(ptr->counter) with int in __CS_LOOP and
and long long in __CSG_LOOP avoids the two stores. A better solution
would be to drop the volatile from the counter field of the atomic_t
and atomic64_t definition. But that is a touchy subject ..
Cc: Matthew Wilcox <matthew@wil.cx>
Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
the todclk.h header file is dead code. Remove it.
Signed-off-by: Christian Borntraeger <borntraeger@de.ibm.com>
Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
The pagetable walk usercopy functions have used a modified copy of the
do_exception() function for fault handling. This lead to inconsistencies
with recent changes to do_exception(), e.g. performance counters. This
patch changes the pagetable walk usercopy code to call do_exception()
directly, eliminating the redundancy. A new parameter is added to
do_exception() to specify the fault address.
Signed-off-by: Gerald Schaefer <gerald.schaefer@de.ibm.com>
Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
Slim down the do_exception function to handle only the fast path of a
fault and move the exceptional cases into a new function. That slightly
increases the performance of the fault handling.
Build fix for !CONFIG_COMPAT by
Kamalesh Babulal <kamalesh@linux.vnet.ibm.com>
Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
notify_page_fault does a preempt_disable/preempt_enable for each
fault generated by a kernel access to user space. If kprobes
is not active that is unnecessary since the interrupts are not
reenabled yet. To play safe repeat the kprobe_running check after
preempt_disable().
Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
Introduce user_mode to replace the two variables switch_amode and
s390_noexec. There are three valid combinations of the old values:
1) switch_amode == 0 && s390_noexec == 0
2) switch_amode == 1 && s390_noexec == 0
3) switch_amode == 1 && s390_noexec == 1
They get replaced by
1) user_mode == HOME_SPACE_MODE
2) user_mode == PRIMARY_SPACE_MODE
3) user_mode == SECONDARY_SPACE_MODE
The new kernel parameter user_mode=[primary,secondary,home] lets
you choose the address space mode the user space processes should
use. In addition the CONFIG_S390_SWITCH_AMODE config option
is removed.
Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
A data access in access-register mode always is a user mode access,
the code to inspect the access-registers can be removed. The second
change is to use a different test to check for no-execute fault.
The third change is to pass the translation exception identification
as parameter, in theory the trans_exc_code in the lowcore could have
been overwritten by the time the call to check_space from do_no_context
is done.
Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
Split setting (driver wants feature enabled) and status (feature
setup was successful) for PGID related ccw device features so that
setup errors can be detected. Previously, incorrectly handled setup
errors could in rare cases lead to erratic I/O behavior and
permanently unusuable devices.
Signed-off-by: Peter Oberparleiter <peter.oberparleiter@de.ibm.com>
Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
When the kernel is IPLed without the CLEAR option and switches
to 64-bit, the high-order half of the registers might contain
random values. This can cause addressing exceptions and the
kernel enters an interrupt loop.
Initialize the high-order half of the general purpose registers
with zeros after switching to 64-bit mode.
Cc: <stable@kernel.org>
Signed-off-by: Hendrik Brueckner <brueckner@linux.vnet.ibm.com>
Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
* 'tracing-core-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip: (40 commits)
tracing: Separate raw syscall from syscall tracer
ring-buffer-benchmark: Add parameters to set produce/consumer priorities
tracing, function tracer: Clean up strstrip() usage
ring-buffer benchmark: Run producer/consumer threads at nice +19
tracing: Remove the stale include/trace/power.h
tracing: Only print objcopy version warning once from recordmcount
tracing: Prevent build warning: 'ftrace_graph_buf' defined but not used
ring-buffer: Move access to commit_page up into function used
tracing: do not disable interrupts for trace_clock_local
ring-buffer: Add multiple iterations between benchmark timestamps
kprobes: Sanitize struct kretprobe_instance allocations
tracing: Fix to use __always_unused attribute
compiler: Introduce __always_unused
tracing: Exit with error if a weak function is used in recordmcount.pl
tracing: Move conditional into update_funcs() in recordmcount.pl
tracing: Add regex for weak functions in recordmcount.pl
tracing: Move mcount section search to front of loop in recordmcount.pl
tracing: Fix objcopy revision check in recordmcount.pl
tracing: Check absolute path of input file in recordmcount.pl
tracing: Correct the check for number of arguments in recordmcount.pl
...
* 'core-locking-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip:
mutex: Fix missing conditions to build mutex_spin_on_owner()
mutex: Better control mutex adaptive spinning config
locking, task_struct: Reduce size on TRACE_IRQFLAGS and 64bit
locking: Use __[SPIN|RW]_LOCK_UNLOCKED in [spin|rw]_lock_init()
locking: Remove unused prototype
locking: Reduce ifdefs in kernel/spinlock.c
locking: Make inlining decision Kconfig based
Use the new unreachable() macro instead of for(;;);
Signed-off-by: David Daney <ddaney@caviumnetworks.com>
Acked-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
CC: Heiko Carstens <heiko.carstens@de.ibm.com>
CC: linux390@de.ibm.com
CC: linux-s390@vger.kernel.org
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
commit 892a7c67 (locking: Allow arch-inlined spinlocks) implements the
selection of which lock functions are inlined based on defines in
arch/.../spinlock.h: #define __always_inline__LOCK_FUNCTION
Despite of the name __always_inline__* the lock functions can be built
out of line depending on config options. Also if the arch does not set
some inline defines the generic code might set them; again depending on
config options.
This makes it unnecessary hard to figure out when and which lock
functions are inlined. Aside of that it makes it way harder and
messier for -rt to manipulate the lock functions.
Convert the inlining decision to CONFIG switches. Each lock function
is inlined depending on CONFIG_INLINE_*. The configs implement the
existing dependencies. The architecture code can select ARCH_INLINE_*
to signal that it wants the corresponding lock function inlined.
ARCH_INLINE_* is necessary as Kconfig ignores "depends on"
restrictions when a config element is selected.
No functional change.
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
LKML-Reference: <20091109151428.504477141@linutronix.de>
Acked-by: Heiko Carstens <heiko.carstens@de.ibm.com>
Reviewed-by: Ingo Molnar <mingo@elte.hu>
Acked-by: Peter Zijlstra <a.p.zijlstra@chello.nl>
On s390 there are two ways of specifying the system call number for
the svc instruction. The standard way is to use the immediate field
in the instruction (or to use EXecute for values unknown during
assemble time). This can encode 256 system calls.
The kernel ABI also allows to put the system call number in r1 and
then execute svc 0 to enable system call numbers > 255.
It turns out that single stepping svc 0 is broken, since the PER
program check handler uses r1. We have to use a different register.
Signed-off-by: Christian Borntraeger <borntraeger@de.ibm.com>
Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
After an IPL from NSS the uptime of the system is incorrect. The reason
is that the startup code in head.S is not executed in case of an IPL
from NSS. Due to that sched_clock_base_cc which is used to initialze
wall_to_monotonic contains the time stamp when the NSS has been created
instead of the time stamp of the system start.
Reinitialize the cputime accounting values in create_kernel_nss after
the SAVESYS CP command that created the NSS segment.
Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
sigp sense only returns the status of a cpu if it is non zero. If the
status of the sensed cpu is all zeros condition code 0 (accpeted) is
set and no status bits are returned.
The current code however assumes that a status was returned and tests
bits in it. This means uninitalized data is accessed with random
results.
Worst case is that the code that checks if cpu is offline on cpu
hotplug assumes that the target cpu is offline while it is still
running. This leads potentially to memory corruption since resources
that are still needed by the target cpu will be freed and could be
resused while still in use.
Signed-off-by: Heiko Carstens <heiko.carstens@de.ibm.com>
Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
According to the architecture a cpu must not necessarily enter stopped
state after completion of a sigp instruction with "stop" order code.
So remove the BUG() statement after self sending sigp stop to avoid
that it ever gets reached.
Also add a sigp busy check to make sure that the order gets delivered.
Signed-off-by: Heiko Carstens <heiko.carstens@de.ibm.com>
Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
Offlined cpus still have valid prefix register contents. Dumpers
will store the register contents of a cpu to the location where its
prefix register points to.
For offlined cpus the area (lowcore) has been freed and the dumper
would write the uninteresting contents of the offline cpu to a memory
location which might be in use by some other component and destroy
valueable information.
To fix this set the prefix register of offline cpus to absolute
address zero again. This prevents the current dumpers to write to
random memory locations.
Signed-off-by: Heiko Carstens <heiko.carstens@de.ibm.com>
Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
This patch makes the hwcap bit for the high gprs feature to be visible
in /proc/cpuinfo.
Signed-off-by: Andreas Krebbel <Andreas.Krebbel@de.ibm.com>
Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
Hypfs never worked on systems that only provide D204 subcode 6.
In these cases we nevertheless used subcode 7. With this fix, we
use subcode 6, if it is available and the system does not provide
subcode 7.
Signed-off-by: Michael Holzheu <holzheu@linux.vnet.ibm.com>
Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
Most of the syscalls metadata processing is done from arch.
But these operations are mostly generic accross archs. Especially now
that we have a common variable name that expresses the number of
syscalls supported by an arch: NR_syscalls, the only remaining bits
that need to reside in arch is the syscall nr to addr translation.
v2: Compare syscalls symbols only after the "sys" prefix so that we
avoid spurious mismatches with archs that have syscalls wrappers,
in which case syscalls symbols have "SyS" prefixed aliases.
(Reported by: Heiko Carstens)
Signed-off-by: Frederic Weisbecker <fweisbec@gmail.com>
Acked-by: Heiko Carstens <heiko.carstens@de.ibm.com>
Cc: Ingo Molnar <mingo@elte.hu>
Cc: Steven Rostedt <rostedt@goodmis.org>
Cc: Li Zefan <lizf@cn.fujitsu.com>
Cc: Masami Hiramatsu <mhiramat@redhat.com>
Cc: Jason Baron <jbaron@redhat.com>
Cc: Lai Jiangshan <laijs@cn.fujitsu.com>
Cc: Martin Schwidefsky <schwidefsky@de.ibm.com>
Cc: Paul Mundt <lethal@linux-sh.org>
This patch adds an EX_TABLE entry to mvc{p|s|os} usercopy functions that
may be called with KERNEL_DS. In combination with collaborative memory
management, kernel pages marked as unused may trigger an adressing exception
in the usercopy functions. This fixes an unhandled addressing exception bug
where strncpy_from_user() is used with len > strnlen and KERNEL_DS, crossing
a page boundary to an unused page.
Signed-off-by: Gerald Schaefer <gerald.schaefer@de.ibm.com>
Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
We used address 0x1084 instead of 0x84 to store the suspend CPU address.
With this patch we use the correct address 0x84 as it is defined in
the POP.
Signed-off-by: Michael Holzheu <holzheu@linux.vnet.ibm.com>
Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
The time a system has been suspended should not show up in any
of the cputime accounting fields. The time of inactivity is definitly
not any form of real cputime nor is it idle time.
Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
The function graph tracer used to have a protection against NMI
while entering a function entry tracing. But this is useless now,
the tracer is reentrant and the ring buffer supports NMI tracing.
Same as 07868b086c for x86.
Signed-off-by: Heiko Carstens <heiko.carstens@de.ibm.com>
Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
The system call takes a signed length parameter. So perform sign
extension instead of zero extension.
Signed-off-by: Heiko Carstens <heiko.carstens@de.ibm.com>
Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
Use an own implementation instead of the common code udelay loop.
Signed-off-by: Heiko Carstens <heiko.carstens@de.ibm.com>
Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
When udelay() gets called with a delay that would expire before the
next clock event it reprograms the clock comparator.
When the interrupt happens the clock comparator won't be resetted
therefore the interrupt condition doesn't get cleared.
The result is an endless timer interrupt loop until the next clock
event would expire (stored in lowcore).
So udelay() usually would wait much longer for small delays than it
should.
Fix this by disabling the local tick which makes sure that the clock
comparator will be resetted when a timer interrupt happens.
Signed-off-by: Christian Borntraeger <borntraeger@de.ibm.com>
Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
The s390 version of module_frob_arch_sections allocates additional
syminfos for got and plt offsets. These syminfos are freed on
sucessful module load. If the module fails to load (e.g. missing
dependency when using insmod instead of modprobe) this area is not
freed.
This patch lets module_free free this area. Please note, we have to
set the pointer to NULL since module_free is called several times
from the generic code.
Signed-off-by: Christian Borntraeger <borntraeger@de.ibm.com>
Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
Also increase the maximum possible kmemleak early log entries since
2000 are not sufficient on s390.
Signed-off-by: Heiko Carstens <heiko.carstens@de.ibm.com>
Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
next-20090925 randconfig build breaks on s390x, with CONFIG_AIO=n.
arch/s390/mm/pgtable.c: In function 's390_enable_sie':
arch/s390/mm/pgtable.c:282: error: 'struct mm_struct' has no member named 'ioctx_list'
arch/s390/mm/pgtable.c:298: error: 'struct mm_struct' has no member named 'ioctx_list'
make[1]: *** [arch/s390/mm/pgtable.o] Error 1
Reported-by: Kamalesh Babulal <kamalesh@linux.vnet.ibm.com>
Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
commit 628eb9b8a8
KVM: s390: streamline memslot handling
introduced kvm_s390_vcpu_get_memsize. This broke guests >=4G, since this
function returned an int.
This patch changes the return value to a long.
Signed-off-by: Christian Borntraeger <borntraeger@de.ibm.com>
Signed-off-by: Avi Kivity <avi@redhat.com>
It's unused.
It isn't needed -- read or write flag is already passed and sysctl
shouldn't care about the rest.
It _was_ used in two places at arch/frv for some reason.
Signed-off-by: Alexey Dobriyan <adobriyan@gmail.com>
Cc: David Howells <dhowells@redhat.com>
Cc: "Eric W. Biederman" <ebiederm@xmission.com>
Cc: Al Viro <viro@zeniv.linux.org.uk>
Cc: Ralf Baechle <ralf@linux-mips.org>
Cc: Martin Schwidefsky <schwidefsky@de.ibm.com>
Cc: Ingo Molnar <mingo@elte.hu>
Cc: "David S. Miller" <davem@davemloft.net>
Cc: James Morris <jmorris@namei.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
* git://git.kernel.org/pub/scm/linux/kernel/git/rusty/linux-2.6-for-linus: (39 commits)
cpumask: Move deprecated functions to end of header.
cpumask: remove unused deprecated functions, avoid accusations of insanity
cpumask: use new-style cpumask ops in mm/quicklist.
cpumask: use mm_cpumask() wrapper: x86
cpumask: use mm_cpumask() wrapper: um
cpumask: use mm_cpumask() wrapper: mips
cpumask: use mm_cpumask() wrapper: mn10300
cpumask: use mm_cpumask() wrapper: m32r
cpumask: use mm_cpumask() wrapper: arm
cpumask: Use accessors for cpu_*_mask: um
cpumask: Use accessors for cpu_*_mask: powerpc
cpumask: Use accessors for cpu_*_mask: mips
cpumask: Use accessors for cpu_*_mask: m32r
cpumask: remove arch_send_call_function_ipi
cpumask: arch_send_call_function_ipi_mask: s390
cpumask: arch_send_call_function_ipi_mask: powerpc
cpumask: arch_send_call_function_ipi_mask: mips
cpumask: arch_send_call_function_ipi_mask: m32r
cpumask: arch_send_call_function_ipi_mask: alpha
cpumask: remove obsolete topology_core_siblings and topology_thread_siblings: ia64
...
* remove asm/atomic.h inclusion from linux/utsname.h --
not needed after kref conversion
* remove linux/utsname.h inclusion from files which do not need it
NOTE: it looks like fs/binfmt_elf.c do not need utsname.h, however
due to some personality stuff it _is_ needed -- cowardly leave ELF-related
headers and files alone.
Signed-off-by: Alexey Dobriyan <adobriyan@gmail.com>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
We're weaning the core code off handing cpumask's around on-stack.
This introduces arch_send_call_function_ipi_mask().
Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
* git://git.kernel.org/pub/scm/linux/kernel/git/sam/kbuild-next: (30 commits)
Use macros for .data.page_aligned section.
Use macros for .bss.page_aligned section.
Use new __init_task_data macro in arch init_task.c files.
kbuild: Don't define ALIGN and ENTRY when preprocessing linker scripts.
arm, cris, mips, sparc, powerpc, um, xtensa: fix build with bash 4.0
kbuild: add static to prototypes
kbuild: fail build if recordmcount.pl fails
kbuild: set -fconserve-stack option for gcc 4.5
kbuild: echo the record_mcount command
gconfig: disable "typeahead find" search in treeviews
kbuild: fix cc1 options check to ensure we do not use -fPIC when compiling
checkincludes.pl: add option to remove duplicates in place
markup_oops: use modinfo to avoid confusion with underscored module names
checkincludes.pl: provide usage helper
checkincludes.pl: close file as soon as we're done with it
ctags: usability fix
kernel hacking: move STRIP_ASM_SYMS from General
gitignore usr/initramfs_data.cpio.bz2 and usr/initramfs_data.cpio.lzma
kbuild: Check if linker supports the -X option
kbuild: introduce ld-option
...
Fix trivial conflict in scripts/basic/fixdep.c
* 'for-linus' of git://git390.marist.edu/pub/scm/linux-2.6: (22 commits)
[S390] Update default configuration.
[S390] hibernate: Do real CPU swap at resume time
[S390] dasd: tolerate devices that have no feature codes
[S390] zcrypt: Do not add/remove devices in s/r callbacks
[S390] hibernate: make sure pfn_is_nosave handles lowcore pages
[S390] smp: introduce LC_ORDER and simplify lowcore handling
[S390] ptrace: use common code for simple peek/poke operations
[S390] fix disabled_wait inline assembly clobber list
[S390] Change kernel_page_present coding style.
[S390] hibernation: reset system after resume
[S390] hibernation: fix guest page hinting related crash
[S390] Get rid of init_module/delete_module compat functions.
[S390] Convert sys_execve to function with parameters.
[S390] Convert sys_clone to function with parameters.
[S390] qdio: change state of all primed input buffers
[S390] qdio: reduce per device debug messages
[S390] cio: introduce consistent subchannel scanning
[S390] cio: idset use actual number of ssids
[S390] cio: dont kfree vmalloced memory
[S390] cio: introduce css_settle
...
Currently, when the physical resume CPU is not equal to the physical suspend
CPU, we swap the CPUs logically, by modifying the logical/physical CPU mapping.
This has two major drawbacks: First the change is visible from user space (e.g.
CPU sysfs files) and second it is hard to ensure that nowhere in the kernel
the physical CPU ID is stored before suspend.
To fix this, we now really swap the physical CPUs, if the resume CPU is not
the pysical suspend CPU. We restart the suspend CPU and stop the resume CPU
using SIGP restart and SIGP stop. If the suspend CPU is no longer available,
we write a message and load a disabled wait PSW.
Signed-off-by: Michael Holzheu <michael.holzheu@linux.vnet.ibm.com>
Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
pfn_is_nosave doesn't return the correct value for the second lowcore
page if lowcore protection is enabled. Make sure it always returns
the correct value.
While at it simplify the whole thing.
NSS special handling is done by the tprot check like it already works
for DCSS as well. So remove the extra code for NSS.
Signed-off-by: Heiko Carstens <heiko.carstens@de.ibm.com>
Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>