Patch from Charles Spirakis
Some linux customers want to optimize their applications on the latest
hardware but are not yet willing to upgrade to the latest kernel. This
patch provides a way to plug in an alternate, basic, and GPL'ed PMU
subsystem to help with their monitoring needs or for specialty work. It
can also be used in case of serious unexpected bugs in perfmon. Mutual
exclusion between the two subsystems is guaranteed, hence no conflict
can arise from both subsystem being present.
Acked-by: Stephane Eranian <eranian@hpl.hp.com>
Signed-off-by: Tony Luck <tony.luck@intel.com>
The 2.6 kernel has CPE error thresholding.
This patch lets SAL know of this error handling feature.
The changes are SN specific.
Signed-off-by: Russ Anderson (rja@sgi.com)
Signed-off-by: Tony Luck <tony.luck@intel.com>
acpi_request_vector() is called in ia64_mca_init() to get the cpe_vector.
The problem is that acpi_request_vector() looks in platform_intr_list[] to
get the vector, but platform_intr_list[] is not initialized with a valid
vector until later (in sn_setup()). Without a valid vector the code
defaults to polling mode.
This patch moves the call to acpi_request_vector() from ia64_mca_init()
to ia64_mca_late_init(), which is after platform_intr_list[] is initialized.
Signed-off-by: Russ Anderson (rja@sgi.com)
Signed-off-by: Tony Luck <tony.luck@intel.com>
convert_to_non_syscall() has the same problem that unwind_to_user()
used to have. Fix it likewise.
Signed-off-by: David Mosberger-Tang <davidm@hpl.hp.com>
Signed-off-by: Tony Luck <tony.luck@intel.com>
These days <linux/ioctl32.h> handles everything, no need for an asm
header on just two architectures.
Signed-off-by: Christoph Hellwig <hch@lst.de>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
Use helper functions to convert between timeval structure and jiffies
rather than custom logic.
Signed-off-by: Nishanth Aravamudan <nacc@us.ibm.com>
Signed-off-by: Domen Puncer <domen@coderock.org>
Acked-by: Richard Henderson <rth@twiddle.net>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
This patch removes the assumption that LAPIC entries contain the BSP as its
first entry. This is a slight improvement to the temporary fix submitted by
Suresh Siddha.
- Removes assumption that LAPIC entries contain BSP first.
- Builds x86_acpiid_to_apicid[] and bios_cpu_apicid[] properly with BSP as
first entry.
- Made maxcpus=1 boot on these systems. Since the parsing earlier in
arch/x86_64/kernel/mpparse.c stopped after maxcpus entries, other entries
were not processed, this causes kernel not to boot on these systems.
TBD: x86_acpiid_to_apicid and bios_cpu_apicid[] seem to be exactly the
same. This could be removed, but might need more work to cleanup.
Signed-off-by: Ashok Raj <ashok.raj@intel.com>
Signed-off-by: Andi Kleen <ak@suse.de>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
Collected NMI watchdog fixes.
- Fix call of check_nmi_watchdog
- Remove earlier move of check_nmi_watchdog to later. It does not fix the
race it was supposed to fix fully.
- Remove unused P6 definitions
- Add support for performance counter based watchdog on P4 systems.
This allows to run it only once per second, which saves some CPU time.
Previously it would run at 1000Hz, which was too much.
Code ported from i386
Make this the default on Intel systems.
- Use check_nmi_watchdog with local APIC based nmi
- Fix race in touch_nmi_watchdog
- Fix bug that caused incorrect performance counters to be programmed in a
few cases on K8.
- Remove useless check for local APIC
- Use local_t and per_cpu variables for per CPU data.
- Keep other CPUs busy during check_nmi_watchdog to make sure they really
tick when in lapic mode.
- Only check CPUs that are actually online.
- Various other fixes.
- Fix fallback path when MSRs are unimplemented
Signed-off-by: Andi Kleen <ak@suse.de>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
Originally from Matt Tolentino
Signed-off-by: Andi Kleen <ak@suse.de>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
Use bitmap_zero instead of bitmap_empty to initialise cpu mask This makes it
actually run reliable instead of relying on stack state.
Signed-off-by: Suresh Siddha <suresh.b.siddha@intel.com>
Signed-off-by: Andi Kleen <ak@suse.de>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
The PTEs can point to ioremap mappings too, and these are often outside
mem_map. The NUMA hash page lookup functions cannot handle out of bounds
accesses properly.
Signed-off-by: Andi Kleen <ak@suse.de>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
Allowed user programs to set a non canonical segment base, which would cause
oopses in the kernel later.
Credit-to: Alexander Nyberg <alexn@dsv.su.se>
For identifying and reporting this bug.
Signed-off-by: Andi Kleen <ak@suse.de>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
This works around an AMD Erratum.
Signed-off-by: Andi Kleen <ak@suse.de>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
There are unfortunately more and more multi processor Opteron systems which
don't have HPET timer support in the southbridge. This covers in particular
Nvidia and VIA chipsets. They also don't guarantee that the TSCs are
synchronized between CPUs; and especially with MP powernow the systems are
nearly unusable because the time gets very inconsistent between CPUs.
The timer code for x86-64 was originally written under the assumption that we
could fall back to the HPET timer on such systems. But this doesn't work
there.
Another alternative is to use the ACPI PM timer as primary time source. This
patch does that. The kernel only uses PM timer when there is no other choice
because it has some disadvantages.
Ported over from i386. It should be faster than the i386 version because I
dropped the "read three times" workaround, but is still considerable slower
than HPET and also does not work together with vsyscalls which have to be
disabled.
Cc: <mark.langsdorf@amd.com>
Signed-off-by: Andi Kleen <ak@suse.de>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
It is unnecessary on modern Intel or AMD systems, and that is all we support
on x86-64
Also causes problems on various systems
Signed-off-by: Andi Kleen <ak@suse.de>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
It is not very useful to the user and more an kernel internal implementation
detail. So hide it.
Signed-off-by: Andi Kleen <ak@suse.de>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
The new TSC sync algorithm recently submitted did not work too well.
The result was that some MP machines where the TSC came up of the BIOS very
unsynchronized and that did not have HPET support were nearly unusable because
the time would jump forwards and backwards between CPUs.
After a lot of research ;-) and some more prototypes I ended up with just
using the one from IA64 which looks best. It has some internal self tuning
that should adapt to changing interconnect latencies. It holds up in my tests
so far.
I believe it was originally written by David Mosberger, I just ported it over
to x86-64. See the inline comment for a description.
This cleans up the code because it uses smp_call_function for syncing instead
of having custom hooks in SMP bootup.
Please note that the cycle numbers it outputs are too optimistic because they
do not take into account the latency of WRMSR and RDTSC, which can be hundreds
of cycles. It seems to be able to sync a dual Opteron to 200-300 cycles,
which is probably good enough.
There is a timing window during AP bootup where interrupts can see
inconsistent time before the TSC is synced. It is hard to avoid unfortunately
because we can only do the TSC sync after some setup, and we need to enable
interrupts before that. I just ignored it for now.
Signed-off-by: Andi Kleen <ak@suse.de>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
It could be in a memory hole not mapped in mem_map and that causes the hash
lookup to go off to nirvana.
Signed-off-by: Andi Kleen <ak@suse.de>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
Last round hopefully of cpu_core_id changes hopefully fow now:
- Always initialize cpu_core_id for all CPUs, even when no dual core setup
is detected. This prevents funny /proc/cpuinfo output
- Do the same with phys_proc_id[] even when no HyperThreading - dito.
- Use the CPU APIC-ID from CPUID 1 instead of the linux virtual CPU number
to identify the core for AMD dual core setups.
Patch for i386/x86-64.
Signed-off-by: Andi Kleen <ak@suse.de>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
Cleans up the system exit call slightly and synchronizes with my tree again.
Signed-off-by: Andi Kleen <ak@suse.de>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
NR_CPUs can be quite big these days. kmalloc the per CPU array instead of
putting it onto the stack
Signed-off-by: Andi Kleen <ak@suse.de>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
Replace one memcpy() call with overlapping source and dest arguments with
one call to memmove(), to avoid data corruption.
Signed-off-by: Paolo 'Blaisorblade' Giarrusso <blaisorblade@yahoo.it>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
Patch from Ben Dooks
Fix the setting of hdiv when set to divide-by-2. Thanks to
Jeonghoon Yoon for pointing this out.
Change name of the NAND device to "s3c2440-nand" as it
is not similar enough to the "s3c2410-nand" device.
Signed-off-by: Ben Dooks
Signed-off-by: Russell King <rmk+kernel@arm.linux.org.uk>
Patch from Ben Dooks
S3C2440 UPLL is the same as the S3C2410 UPLL, it is only the
MPLL which has an extra multiplication factor of 2 in the
multiplier.
Signed-off-by: Ben Dooks
Signed-off-by: Russell King <rmk+kernel@arm.linux.org.uk>
Patch from Nicolas Pitre
Not all ARMv6 processors implement the TLS register.
Signed-off-by: Nicolas Pitre
Signed-off-by: Russell King <rmk+kernel@arm.linux.org.uk>
Some time ago, GAS was fixed to bring the .spillpsp directive in line
with the Intel assembler manual (there was some disagreement as to
whether or not there is a built-in 16-byte offset). Unfortunately,
there are two places in the kernel where this directive is used in
handwritten assembly files and those of course relied on the "buggy"
behavior. As a result, when using a "fixed" assembler, the kernel
picks up the UNaT bits from the wrong place (off by 16) and randomly
sets NaT bits on the scratch registers. This can be noticed easily by
looking at a coredump and finding various scratch registers with
unexpected NaT values. The patch below fixes this by using the
.spillsp directive instead, which works correctly no matter what
assembler is in use.
Signed-off-by: David Mosberger-Tang <davidm@hpl.hp.com>
Signed-off-by: Tony Luck <tony.luck@intel.com>
Move the locking for copy_user_page() and clear_user_page() into
the implementations which require locking. For simple memcpy/
memset based implementations, the locking is extra overhead which
is not necessary, and prevents preemption occuring.
Signed-off-by: Russell King <rmk@arm.linux.org.uk>
Add pmd_off() and pmd_off_k() to obtain the pmd pointer for a
virtual address, and use them throughout the mm initialisation.
Signed-off-by: Russell King <rmk@arm.linux.org.uk>
I noticed this typo when trying to compile a kernel which had
CONFIG_HOTPLUG turned off. In that case, __devinit is no longer a
no-op and the compiler then detects a section-conflict. Fix by using
__devinitdata instead of __devinit.
Same patch also submitted by Darren Williams to fix compilation error
using sim_defconfig (which has CONFIG_HOTPLUG=n).
Signed-off-by: David Mosberger-Tang <davidm@hpl.hp.com>
Signed-off-by: Darren Williams <dsw@gelato.unsw.edu.au>
Signed-off-by: Tony Luck <tony.luck@intel.com>
This fixes some x86_64 bugs -
- maybe_map returns -1 on error instead of 0, which is interpreted as
physical address 0
- removed an include of ipc.h, which isn't needed
- fixed the calculation of signal frame location
- the signal delivery code is now immune to the stack expansion check
- added a missing include
Signed-off-by: Jeff Dike <jdike@addtoit.com>
Cc: Paolo 'Blaisorblade' Giarrusso <blaisorblade@yahoo.it>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
tt-mode closes switch_pipes in exit_thread_tt and kills processes in
switch_to_tt, if the exit_state is EXIT_DEAD or EXIT_ZOMBIE.
In very rare cases the exiting process can be scheduled out after having set
exit_state and closed switch_pipes (from release_task it calls proc_pid_flush,
which might sleep). If this process is to be restarted, UML failes in
switch_to_tt with:
write of switch_pipe failed, err = 9
We fix this by closing switch_pipes not in exit_thread_tt, but later in
release_thread_tt. Additionally, we set switch_pipe[0] = 0 after closing.
switch_to_tt must not kill "from" process depending on its exit_state, but
must kill it after release_thread was processed only, so it examines
switch_pipe[0] for its decision.
Signed-off-by: Bodo Stroesser <bstroesser@fujitsu-siemens.com>
Signed-off-by: Jeff Dike <jdike@addtoit.com>
Cc: Paolo 'Blaisorblade' Giarrusso <blaisorblade@yahoo.it>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
Only x86 and x86_64 use arch_align_stack(), all other subarches have:
#define arch_align_stack(x) (x)
So, if this definition is found, UML's own arch_align_stack() should be
skipped.
Signed-off-by: Bodo Stroesser <bstroesser@fujitsu-siemens.com>
Cc: Paolo 'Blaisorblade' Giarrusso <blaisorblade@yahoo.it>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
tt/mem.c still uses hardcoded TOP for i386 instead of CONFIG_TOP_ADDR provided
by subarch's Kconfig_XXXX, which would be right.
Signed-off-by: Bodo Stroesser <bstroesser@fujitsu-siemens.com>
Signed-off-by: Jeff Dike <jdike@addtoit.com>
Cc: Paolo 'Blaisorblade' Giarrusso <blaisorblade@yahoo.it>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
So, there I was, looking at my own code, wondering what the magic setjmp
return values did. This patch turns the constants that are used to make
requests of the initial thread into meaningful symbols.
Signed-off-by: Jeff Dike <jdike@addtoit.com>
Cc: Paolo 'Blaisorblade' Giarrusso <blaisorblade@yahoo.it>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
This eliminates some stuff from arch/um/kernel/Makefile which refers to a
file which has long since been deleted.
Signed-off-by: Jeff Dike <jdike@addtoit.com>
Cc: Paolo 'Blaisorblade' Giarrusso <blaisorblade@yahoo.it>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
Eliminate the non-inline version of switch_mm, which can't be used,
considering the inline version in asm/mmu_context.h
Signed-off-by: Jeff Dike <jdike@addtoit.com>
Cc: Paolo 'Blaisorblade' Giarrusso <blaisorblade@yahoo.it>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
s390 tt-mode needs to save not only syscall number, but an further register
also.
Signed-off-by: Bodo Stroesser <bstroesser@fujitsu-siemens.com>
Signed-off-by: Jeff Dike <jdike@addtoit.com>
Cc: Paolo 'Blaisorblade' Giarrusso <blaisorblade@yahoo.it>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
s390 needs to change some parts of arch/um/kernel/ptrace.c. Thus, the code
regarding PEEKUSER and POKEUSER are shifted to arch/um/sys-<subarch>/ptrace.c.
Also s390 debug registers need to be updated, when singlestepping is switched
on / off. Thus, setting/resetting of singlestepping is centralized in the new
function set_singlestep(), which also inserts the macro
SUBARCH_SET_SINGLESTEP(mode), if defined.
Finally, s390 has the "ieee_instruction_pointer" in its
registers, which also is allowed to be read via
ptrace( PTRACE_PEEKUSER, getpid(), PT_IEEE_IP, 0);
To implement this feature, sys_ptrace inserts the macro
SUBARCH_PTRACE_SPECIAL, if defined.
Signed-off-by: Bodo Stroesser <bstroesser@fujitsu-siemens.com>
Signed-off-by: Jeff Dike <jdike@addtoit.com>
Cc: Paolo 'Blaisorblade' Giarrusso <blaisorblade@yahoo.it>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
Command line handling cleanups - a couple of things made static and an
unused declaration removed from header.
Signed-off-by: Jeff Dike <jdike@addtoit.com>
Cc: Paolo 'Blaisorblade' Giarrusso <blaisorblade@yahoo.it>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
Remove the __deprecated from verify_area_skas and verify_area_tt. Since
verify_area is itself marked __deprecated, and it is the only caller of
these, then they don't need to be marked. Marking them only makes the
build noisier.
Signed-off-by: Jeff Dike <jdike@addtoit.com>
Cc: Paolo 'Blaisorblade' Giarrusso <blaisorblade@yahoo.it>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>