remove remainder of additional_cpus logic. We now just listen to the
disabled_cpus value like we did for years. disabled_cpus is always >=
0 so no need for an extra check.
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
additional_cpus=<x> parameter is dangerous and broken: for example
if we boot additional_cpus=-2 on a stock dual-core system it will
crash the box on bootup.
So reduce the maze of code a bit by removingthe user-configurability
angle.
Signed-off-by: Ingo Molnar <mingo@elte.hu>
num_possible_cpus() can be > 1 when disabled CPUs have been accounted.
Disabled CPUs are not in the cpu_present_map, so we can use
num_present_cpus() as a safe indicator to switch to UP alternatives.
Reported-by: Chuck Ebbert <cebbert@redhat.com>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Cc: <stable@kernel.org>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
- define STACKSLOTS_PER_LINE and use it
- define get_bp macro to hide the %%ebp/%%rbp difference
- i386: check task==NULL in dump_trace, like x86_64
- i386: show_trace(NULL, ...) uses current automatically
- x86_64: use [#%d] for die_counter, like i386
- whitespace and comments
Signed-off-by: Alexander van Heukelum <heukelum@fastmail.fm>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
- make kstack= and early_param
- add oops=panic, setting panic_on_oops
Signed-off-by: Alexander van Heukelum <heukelum@fastmail.fm>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
- x86: Write log_lvl strings if available
- start raw stack dumps on new line
- i386: Remove extra indentation for raw stack dumps
Signed-off-by: Alexander van Heukelum <heukelum@fastmail.fm>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
- i386 and x86_64: always printk the 'data' parameter
- i386: announce stack switch (irq -> normal)
- i386: check if there is a stack switch before announcing it
There is a warning that 'context' might come out corrupt in early
boot. If this is true it should be fixed, not worked around.
Signed-off-by: Alexander van Heukelum <heukelum@fastmail.fm>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
- Add "end" parameter to valid_stack_ptr and print_context_stack
- use sizeof(long) as the size of a word on the stack
Signed-off-by: Alexander van Heukelum <heukelum@fastmail.fm>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
- x86_64: use %p to print an address
- make i386-version the same as the above
The result should be the same on x86_64; on i386 the
output only changes if CONFIG_KALLSYMS is turned off,
in which case the address is printed twice.
Signed-off-by: Alexander van Heukelum <heukelum@fastmail.fm>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
For some reason die_nmi is still defined in traps.c for
i386, but is found in dumpstack_64.c for x86_64. Move it
to dumpstack_32.c
Signed-off-by: Alexander van Heukelum <heukelum@fastmail.fm>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
traps_32.c and traps_64.c are now equal. Move one to traps.c,
delete the other one and change the Makefile
Signed-off-by: Alexander van Heukelum <heukelum@fastmail.fm>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
Use CONFIG_X86_64/CONFIG_X86_32 to condtionally compile the
parts needed for x86_64 or i386 only.
Runs a small userspace for a number of minimal configurations
and boots the defconfigs.
Signed-off-by: Alexander van Heukelum <heukelum@fastmail.fm>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
Use task_pid_nr(tsk) instead of tsk->pid in do_general_protection.
Signed-off-by: Alexander van Heukelum <heukelum@fastmail.fm>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
This is the last user of clear_mem_error, which is defined
only on i386. Expand the inline function and remove it from
include/asm-x86/mach-default/mach_traps.h
Signed-off-by: Alexander van Heukelum <heukelum@fastmail.fm>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
Use preempt_conditional_sti/cli in do_int3, like on x86_64.
Signed-off-by: Alexander van Heukelum <heukelum@fastmail.fm>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
- rename variable me -> tsk
- get thread and tsk like i386
- expand used_math()
- copy comment
Signed-off-by: Alexander van Heukelum <heukelum@fastmail.fm>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
- set_system_gate on i386 is really set_system_trap_gate
- set_system_gate on x86_64 is really set_system_intr_gate
- ist=0 means no special stack switch is done:
- introduce STACKFAULT_STACK, DOUBLEFAULT_STACK, NMI_STACK,
DEBUG_STACK and MCE_STACK as on x86_64.
- use the _ist variants with XXX_STACK set to zero
- remove set_system_gate
Signed-off-by: Alexander van Heukelum <heukelum@fastmail.fm>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
traps: x86: correct copy/paste bug: a trap is a GATE_TRAP
Fix copy/paste/forgot-to-edit bug in desc.h.
Signed-off-by: Alexander van Heukelum <heukelum@fastmail.fm>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
Make the x86_64-version and the i386-version of do_debug
more similar.
- introduce preempt_conditional_sti/cli to i386. The preempt-count
is now elevated during the trap handler, like on x86_64. It
does not run on a separate stack, however.
- replace an open-coded "send_sigtrap"
- copy some comments
Signed-off-by: Alexander van Heukelum <heukelum@fastmail.fm>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
Mark the exception handlers with "dotraplinkage" to hide the
calling convention differences between i386 and x86_64.
Signed-off-by: Alexander van Heukelum <heukelum@fastmail.fm>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
x86_64 does not do the lazy io-bitmap dance. Putting it in
its own function makes i386's do_general_protection look
much more like x86_64's.
Signed-off-by: Alexander van Heukelum <heukelum@fastmail.fm>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
Split out math_error from do_coprocessor_error and simd_math_error
from do_simd_coprocessor_error, like on i386. While at it, add the
"error_code" parameter to do_coprocessor_error, do_simd_coprocessor_error
and do_spurious_interrupt_bug.
This does not change the generated code, but brings the declarations in
line with all the other trap handlers.
Signed-off-by: Alexander van Heukelum <heukelum@fastmail.fm>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
The dumpstack code is logically quite independent from the
hardware traps. Split it out into its own file.
Signed-off-by: Alexander van Heukelum <heukelum@fastmail.fm>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
The dumpstack code is logically quite independent from the
hardware traps. Split it out into its own file.
Signed-off-by: Alexander van Heukelum <heukelum@fastmail.fm>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
virt_addr_valid() calls __pa(), which calls __phys_addr(). With
CONFIG_DEBUG_VIRTUAL=y, __phys_addr() will kill the kernel if the
address *isn't* valid. That's clearly wrong for virt_addr_valid().
We also incorporate the debugging checks into virt_addr_valid().
Signed-off-by: Vegard Nossum <vegardno@ben.ifi.uio.no>
Acked-by: Jiri Slaby <jirislaby@gmail.com>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
x86: allow number of additional hotplug CPUs to be set at compile time, V2
The default number of additional CPU IDs for hotplugging is determined
by asking ACPI or mptables how many "disabled" CPUs there are in the
system, but many systems get this wrong so that e.g. a uniprocessor
machine gets an extra CPU allocated and never switches to single CPU
mode.
And sometimes CPU hotplugging is enabled only for suspend/hibernate
anyway, so the additional CPU IDs are not wanted. Allow the number
to be set to zero at compile time.
Also, force the number of extra CPUs to zero if hotplugging is disabled
which allows removing some conditional code.
Tested on uniprocessor x86_64 that ACPI claims has a disabled processor,
with CPU hotplugging configured.
("After" has the number of additional CPUs set to 0)
Before: NR_CPUS: 512, nr_cpu_ids: 2, nr_node_ids 1
After: NR_CPUS: 512, nr_cpu_ids: 1, nr_node_ids 1
[Changed the name of the option and the prompt according to Ingo's
suggestion.]
Signed-off-by: Ingo Molnar <mingo@elte.hu>
The flag_is_changeable_p() is used by
has_cpuid_p() which can return different results
in the code sequence below:
if (!have_cpuid_p())
identify_cpu_without_cpuid(c);
/* cyrix could have cpuid enabled via c_identify()*/
if (!have_cpuid_p())
return;
Otherwise, the gcc 3.4.6 optimizes these two calls
into one which make the code not working correctly.
Cyrix cpus have the CPUID instruction enabled before
the second call to the have_cpuid_p() but
it is not detected due to the gcc optimization.
Thus the ARR registers (mtrr like) are not detected
on such a cpu.
Signed-off-by: Krzysztof Helt <krzysztof.h1@wp.pl>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
Currently the low-level function to dump user-passed registers on i386 is
called __show_registers() whereas on x86-64 it's called __show_regs(). Unify
the API to simplify porting of kmemcheck to x86-64.
Signed-off-by: Pekka Enberg <penberg@cs.helsinki.fi>
Acked-by: Vegard Nossum <vegard.nossum@gmail.com>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
Add functions that use the infrastructure added by the x2apic code. These
functions were originally stubbed out since the UV code went into the
tree prior to the x2apic code.
Signed-off-by: Jack Steiner <steiner@sgi.com>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
Commit 4a701737 ("x86: move prefill_possible_map calling early, fix")
is the wrong fix: prefill_possible_map() needs to be available
even when CONFIG_HOTPLUG_CPU is not set. A followon patch will do that.
Fix this correctly by making prefill_possible_map() available even when
CONFIG_HOTPLUG_CPU is not set. The function is needed so that
the number of possible CPUs can be determined.
Tested on uniprocessor machine with CPU hotplug disabled.
From boot log:
Before: NR_CPUS: 512, nr_cpu_ids: 512, nr_node_ids 1
After: NR_CPUS: 512, nr_cpu_ids: 1, nr_node_ids 1
Signed-off-by: Chuck Ebbert <cebbert@redhat.com>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
The Winchip-2 and Winchip-2A cpu choices select the
same options for kernel and compiler.
Merge them to save few bytes and reduce confusion.
Signed-off-by: Krzysztof Helt <krzysztof.h1@wp.pl>
Acked-by: H. Peter Anvin <hpa@zytor.com>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
Fix the size of the mappings of UV hub registers. Size must
be a function of the maximum node number within the SSI.
Signed-off-by: Jack Steiner <steiner@sgi.com>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
This patch hardcodes which traps should be forwarded to
handle_vm86_trap in do_trap. This allows to remove the
vm86 parameter from the i386-version of do_trap, which
makes the DO_VM86_ERROR and DO_VM86_ERROR_INFO macros
unnecessary.
x86_64 part is whitespace only.
Signed-off-by: Alexander van Heukelum <heukelum@fastmail.fm>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
The last use of trace_hardirqs_fixup is unnecessary, because the
trap is taken with interrupt off on i386 as well as x86_64, and
the irq-tracer is notified of this from the assembly code.
trace_hardirqs_fixup and trace_hardirqs_fixup_flags are removed
from include/asm-x86/irqflags.h as they are no longer used.
Signed-off-by: Alexander van Heukelum <heukelum@fastmail.fm>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
All exceptions are taken via interrupt gates. TRACE_IRQS_OFF
is called just before entering the C code, so the irq state
is known to the irq tracer at this point. No need to call
trace_hardirqs_fixup.
Signed-off-by: Alexander van Heukelum <heukelum@fastmail.fm>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
All exceptions are taken via interrupt gates. TRACE_IRQS_OFF
is called just before entering the C code, so the irq state
is known to the irq tracer at this point. No need to call
trace_hardirqs_fixup.
Signed-off-by: Alexander van Heukelum <heukelum@fastmail.fm>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
All exceptions are taken via interrupt gates. TRACE_IRQS_OFF
is called just before entering the C code, so the irq state
is known to the irq tracer at this point. No need to call
trace_hardirqs_fixup.
Signed-off-by: Alexander van Heukelum <heukelum@fastmail.fm>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
Add TRACE_IRQS_OFF just before entering the C code.
All exceptions are taken via interrupt gates. If irq tracing is
enabled, it should be notified as soon as possible. Interrupts
are only (conditionally) re-enabled in C code.
Signed-off-by: Alexander van Heukelum <heukelum@fastmail.fm>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
Add TRACE_IRQS_OFF just before entering the C code.
All exceptions are taken via interrupt gates. If irq tracing is
enabled, it should be notified as soon as possible. Interrupts
are only (conditionally) re-enabled in C code.
Signed-off-by: Alexander van Heukelum <heukelum@fastmail.fm>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
Portions of the ACPI code needs to know if a system is a UV system prior
to genapic initialization. This patch adds a call early_acpi_boot_init()
so that the apic type is discovered earlier.
V2 of the patch adding fixes from Yinghai Lu.
Much cleaner and smaller.
Signed-off-by: Jack Steiner <steiner@sgi.com>
Acked-by: Yinghai Lu <yinghai@kernel.org>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
Take it out of time initialization and move it to
cpu detection time.
Signed-off-by: Glauber Costa <glommer@redhat.com>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
Only test for MCA_bus if support for MCA is compiled in.
Also, for x86_64, write the code inside the conditional
for consistency with i386. It won't bite us, since it'll
probably never select CONFIG_MCA anyway.
Signed-off-by: Glauber Costa <gcosta@redhat.com>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
Replace "4" in time_32.c code by sizeof(long).
This way, it can work on x86_64 too.
Signed-off-by: Glauber Costa <gcosta@redhat.com>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
For x86_64, it does not really matter. But makes the
code equal to i386.
Signed-off-by: Glauber Costa <gcosta@redhat.com>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
Save rbp twice: One is for marking the stack frame, as usual (already
there), and the other, to fill pt_regs properly. This is because bx
comes right before the last saved register in that structure, and not
bp. If the base pointer were in the place bx is today, this would not
be needed.
Signed-off-by: Glauber Costa <gcosta@redhat.com>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
Coalesce v8086_mode and user_mode into a single
user_mode_vm() test.
Signed-off-by: Glauber Costa <gcosta@redhat.com>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
Instead of using SEGMENT_IS_KERNEL_CODE, use the
"user_mode" macro, which can play the same role. Delete
the former, since it now lacks any user.
Signed-off-by: Glauber Costa <gcosta@redhat.com>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
Some CPUs have vendor string in the middle of model_id instead of beginning
Signed-off-by: Yinghai Lu <yhlu.kernel@gmail.com>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
Since pte_flags() is much cheaper than pte_val() in some virtualized
environments (namely, Xen), use the former whereever possible.
Signed-off-by: Jan Beulich <jbeulich@novell.com>
Cc: "Nick Piggin" <npiggin@suse.de>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
When nr_range gets decremented, the same slot must be considered for
coalescing with its new successor again.
The issue is apparently pretty benign to native code, but surfaces as a
boot time crash in our forward ported Xen tree (where the page table
setup overall works differently than in native).
Signed-off-by: Jan Beulich <jbeulich@novell.com>
Acked-by: Yinghai Lu <yhlu.kernel@gmail.com>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
We should check if we have ESR register before reading from it.
Signed-off-by: Cyrill Gorcunov <gorcunov@gmail.com>
Cc: Yinghai Lu <yhlu.kernel@gmail.com>
Cc: "Maciej W. Rozycki" <macro@linux-mips.org>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
The check prevents flags on mappings from being changed, which is not
desireable. There's no need to check for replacing a mapping, and
x86-32 does not do this check.
Signed-off-by: Jeremy Fitzhardinge <jeremy.fitzhardinge@citrix.com>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
The remappings in setup.c are all just ordinary memory, so use
early_memremap() rather than early_ioremap().
Signed-off-by: Jeremy Fitzhardinge <jeremy.fitzhardinge@citrix.com>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
early_ioremap() is also used to map normal memory when constructing
the linear memory mapping. However, since we sometimes need to be able
to distinguish between actual IO mappings and normal memory mappings,
add a early_memremap() call, which maps with PAGE_KERNEL (as opposed
to PAGE_KERNEL_IO for early_ioremap()), and use it when constructing
pagetables.
Signed-off-by: Jeremy Fitzhardinge <jeremy.fitzhardinge@citrix.com>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
Use one of the software-defined PTE bits to indicate that a mapping is
intended for an IO address. On native hardware this is irrelevent,
since a physical address is a physical address. But in a virtual
environment, physical addresses are also virtualized, so there needs
to be some way to distinguish between pseudo-physical addresses and
actual hardware addresses; _PAGE_IOMAP indicates this intent.
By default, __supported_pte_mask masks out _PAGE_IOMAP, so it doesn't
even appear in the final pagetable.
Signed-off-by: Jeremy Fitzhardinge <jeremy.fitzhardinge@citrix.com>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
The exception handlers in entry_32.S should now all call
TRACE_IRQS_OFF before calling the C code. The calls to
trace_hardirqs_fixup should now be unnecessary. Remove them.
Signed-off-by: Alexander van Heukelum <heukelum@fastmail.fm>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
At this point interrupts are off, so let's inform the tracing
code of that fact before calling into C.
Signed-off-by: Alexander van Heukelum <heukelum@fastmail.fm>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
At this point interrupts are off, so let's inform the tracing
code of that fact before calling into C.
Signed-off-by: Alexander van Heukelum <heukelum@fastmail.fm>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
At this point interrupts are off, so let's inform the tracing
code of that fact before calling into C.
Signed-off-by: Alexander van Heukelum <heukelum@fastmail.fm>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
Many exceptions use the same code path via the label 'error_code'
in entry_32.S. At this point interrupts are off, so let's inform
the tracing code of that fact before calling into C.
Signed-off-by: Alexander van Heukelum <heukelum@fastmail.fm>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
Only one use of the DO_TRAP macros remains. Expand that one and
remove the macros now.
Signed-off-by: Alexander van Heukelum <heukelum@fastmail.fm>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
Handle general protection exception with interrupt initially off.
Signed-off-by: Alexander van Heukelum <heukelum@fastmail.fm>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
Handle segment not present exception with interrupt initially off.
Signed-off-by: Alexander van Heukelum <heukelum@fastmail.fm>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
Handle no coprocessor exception with interrupt initially off.
device_not_available in entry_32.S calls either math_state_restore
or math_emulate. This patch adds an extra indirection to be
able to re-enable interrupts explicitly in traps_32.c
Signed-off-by: Alexander van Heukelum <heukelum@fastmail.fm>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
The int3 exception was already takes as an interrupt and
do_int3 does not fit in the new DO_ERROR macro. This patch
just expands the DO_TRAP macro and rearranges the code a
bit.
No functional changes intended.
Signed-off-by: Alexander van Heukelum <heukelum@fastmail.fm>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
There is some macro magic in traps_32.c to construct standard
exception dispatch functions. This patch renames the DO_ERROR-
like macros to DO_TRAP, and introduces new DO_ERROR ones that
conditionally reenable interrupts explicitly, like x86_64.
No code changes.
Signed-off-by: Alexander van Heukelum <heukelum@fastmail.fm>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
x86_64 uses a helper function conditional_sti in traps_64.c which
is equal to restore_interrupts in kprobes.h. The only user of
restore_interrupts is in traps_32.c. Introduce conditional_sti
for i386 and remove restore_interrupts.
Signed-off-by: Alexander van Heukelum <heukelum@fastmail.fm>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
Secondary cpus start with local interrupts disabled.
start_secondary() first initializes the new cpu, then it enables the
local interrupts. (although interrupts are enabled within smp_callin()
as well).
Right now, the local interrupts are enabled as a side effect of calling
ipi_call_lock_irq().
The attached patch clarifies when local interrupts are enabled.
Signed-off-by: Manfred Spraul <manfred@colorfullife.com>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
Avoid updating registers or memory twice as well as needlessly loading
or copying registers.
Signed-off-by: Jan Beulich <jbeulich@novell.com>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
deselecting one of the CPU type CONFIG_CPU_SUP_* config options
can render a kernel unbootable. Make sure this option is only
available if CONFIG_EMBEDDED is enabled.
Signed-off-by: Ingo Molnar <mingo@elte.hu>
extend the help text of the CONFIG_CPU_SUP_* config options to
express what it does and what effects it has.
Signed-off-by: Ingo Molnar <mingo@elte.hu>
propagate an error in enabling the PCI device.
Also eliminates this warning:
arch/x86/kernel/amd_iommu_init.c: In function ‘init_iommu_one’:
arch/x86/kernel/amd_iommu_init.c:726: warning: ignoring return value of ‘pci_enable_device’, declared with attribute warn_unused_result
Signed-off-by: Ingo Molnar <mingo@elte.hu>
fix:
arch/x86/kernel/io_apic_64.c: In function ‘print_local_APIC’:
arch/x86/kernel/io_apic_64.c:1284: warning: format ‘%08x’ expects type ‘unsigned int’, but argument 2 has type ‘long unsigned int’
arch/x86/kernel/io_apic_64.c:1285: warning: format ‘%08x’ expects type ‘unsigned int’, but argument 2 has type ‘long unsigned int’
We want to print the two halves of 'icr' at 32 bit width.
Signed-off-by: Ingo Molnar <mingo@elte.hu>
fix warning:
arch/x86/kernel/early_printk.c:993: warning: ‘enable_debug_console’ defined but not used
Eliminate dead code.
Signed-off-by: Ingo Molnar <mingo@elte.hu>
fix warning:
arch/x86/kernel/xsave.c: In function ‘save_i387_xstate’:
arch/x86/kernel/xsave.c:98: warning: ignoring return value of ‘__clear_user’, declared with attribute warn_unused_result
check the return value and act on it. We should not be ignoring faults
at this point.
Signed-off-by: Ingo Molnar <mingo@elte.hu>
Move the prototypes from the generic kernel.h header to the more
appropriate include/asm-x86/bios_ebda.h header file.
Also, remove the check from the power management code - this is a
pure x86 matter for now.
Signed-off-by: Ingo Molnar <mingo@elte.hu>
This adds a user_regset type for the x86 io permissions bitmap.
This makes it appear in core dumps (when ioperm has been used).
It will also make it visible to debuggers in the future.
Signed-off-by: Roland McGrath <roland@redhat.com>
Signed-off-by: H. Peter Anvin <hpa@zytor.com>
[conflict resolutions: Signed-off-by: Ingo Molnar <mingo@elte.hu> ]
Do not crash when enumerating supported CPU architectures
SECURITY_INIT somehow ended up in x86_cpu_dev.init section. That caused printk
in code which prints supported architectures to hit #GP due to non-canonical
address being used.
Signed-off-by: Petr Vandrovec <petr@vandrovec.name>
Cc: thomas.petazzoni@free-electrons.com
Signed-off-by: Ingo Molnar <mingo@elte.hu>
The x86 implementation of early_ioremap has an off by one error. If we get
an object which ends on the first byte of a page we undermap by one page and
this causes a crash on boot with the ASUS P5QL whose DMI table happens to fit
this alignment.
The size computation is currently
last_addr = phys_addr + size - 1;
npages = (PAGE_ALIGN(last_addr) - phys_addr)
(Consider a request for 1 byte at alignment 0...)
Closes#11693
Debugging work by Ian Campbell/Felix Geyer
Signed-off-by: Alan Cox <alan@rehat.com>
Cc: <stable@kernel.org>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
It's possible for get_wchan() to dereference past task->stack + THREAD_SIZE
while iterating through instruction pointers if fp equals the upper boundary,
causing a kernel panic.
Signed-off-by: David Rientjes <rientjes@google.com>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
* 'x86-v28-for-linus-phase4-D' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip: (186 commits)
x86, debug: print more information about unknown CPUs
x86 setup: handle more than 8 CPU flag words
x86: cpuid, fix typo
x86: move transmeta cap read to early_init_transmeta()
x86: identify_cpu_without_cpuid v2
x86: extended "flags" to show virtualization HW feature in /proc/cpuinfo
x86: move VMX MSRs to msr-index.h
x86: centaur_64.c remove duplicated setting of CONSTANT_TSC
x86: intel.c put workaround for old cpus together
x86: let intel 64-bit use intel.c
x86: make intel_64.c the same as intel.c
x86: make intel.c have 64-bit support code
x86: little clean up of intel.c/intel_64.c
x86: make 64 bit to use amd.c
x86: make amd_64 have 32 bit code
x86: make amd.c have 64bit support code
x86: merge header in amd_64.c
x86: add srat_detect_node for amd64
x86: remove duplicated force_mwait
x86: cpu make amd.c more like amd_64.c v2
...
* 'x86-v28-for-linus-phase3-B' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip: (74 commits)
AMD IOMMU: use iommu_device_max_index, fix
AMD IOMMU: use iommu_device_max_index
x86: add PCI IDs for AMD Barcelona PCI devices
x86/iommu: use __GFP_ZERO instead of memset for GART
x86/iommu: convert GART need_flush to bool
x86/iommu: make GART driver checkpatch clean
x86 gart: remove unnecessary initialization
x86: restore old GART alloc_coherent behavior
revert "x86: make GART to respect device's dma_mask about virtual mappings"
x86: export pci-nommu's alloc_coherent
iommu: remove fullflush and nofullflush in IOMMU generic option
x86: remove set_bit_string()
iommu: export iommu_area_reserve helper function
AMD IOMMU: use coherent_dma_mask in alloc_coherent
add AMD IOMMU tree to MAINTAINERS file
AMD IOMMU: use cmd_buf_size when freeing the command buffer
AMD IOMMU: calculate IVHD size with a function
AMD IOMMU: remove unnecessary cast to u64 in the init code
AMD IOMMU: free domain bitmap with its allocation order
AMD IOMMU: simplify dma_mask_to_pages
...
* 'x86-v28-for-linus-phase2-B' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip: (27 commits)
x86, cpa: make the kernel physical mapping initialization a two pass sequence, fix
x86, pat: cleanups
x86: fix pagetable init 64-bit breakage
x86: track memtype for RAM in page struct
x86, cpa: srlz cpa(), global flush tlb after splitting big page and before doing cpa
x86, cpa: remove cpa pool code
x86, cpa: no need to check alias for __set_pages_p/__set_pages_np
x86, cpa: dont use large pages for kernel identity mapping with DEBUG_PAGEALLOC
x86, cpa: make the kernel physical mapping initialization a two pass sequence
x86, cpa: remove USER permission from the very early identity mapping attribute
x86, cpa: rename PTE attribute macros for kernel direct mapping in early boot
x86: make sure the CPA test code's use of _PAGE_UNUSED1 is obvious
linux-next: fix x86 tree build failure
x86: have set_memory_array_{uc,wb} coalesce memtypes, fix
agp: enable optimized agp_alloc_pages methods
x86: have set_memory_array_{uc,wb} coalesce memtypes.
x86: {reverve,free}_memtype() take a physical address
x86: fix pageattr-test
agp: add agp_generic_destroy_pages()
agp: generic_alloc_pages()
...
* 'next' of git://git.kernel.org/pub/scm/linux/kernel/git/davej/cpufreq:
[CPUFREQ] Fix BUG: using smp_processor_id() in preemptible code
[CPUFREQ] Don't export governors for default governor
[CPUFREQ][6/6] cpufreq: Add idle microaccounting in ondemand governor
[CPUFREQ][5/6] cpufreq: Changes to get_cpu_idle_time_us(), used by ondemand governor
[CPUFREQ][4/6] cpufreq_ondemand: Parameterize down differential
[CPUFREQ][3/6] cpufreq: get_cpu_idle_time() changes in ondemand for idle-microaccounting
[CPUFREQ][2/6] cpufreq: Change load calculation in ondemand for software coordination
[CPUFREQ][1/6] cpufreq: Add cpu number parameter to __cpufreq_driver_getavg()
[CPUFREQ] use deferrable delayed work init in conservative governor
[CPUFREQ] drivers/cpufreq/cpufreq.c: Adjust error handling code involving cpufreq_cpu_put
[CPUFREQ] add error handling for cpufreq_register_governor() error
[CPUFREQ] acpi-cpufreq: add error handling for cpufreq_register_driver() error
[CPUFREQ] Coding style fixes to arch/x86/kernel/cpu/cpufreq/powernow-k6.c
[CPUFREQ] Coding style fixes to arch/x86/kernel/cpu/cpufreq/elanfreq.c
* 'sched-v28-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip: (38 commits)
sched debug: add name to sched_domain sysctl entries
sched: sync wakeups vs avg_overlap
sched: remove redundant code in cpu_cgroup_create()
sched_rt.c: resch needed in rt_rq_enqueue() for the root rt_rq
cpusets: scan_for_empty_cpusets(), cpuset doesn't seem to be so const
sched: minor optimizations in wake_affine and select_task_rq_fair
sched: maintain only task entities in cfs_rq->tasks list
sched: fixup buddy selection
sched: more sanity checks on the bandwidth settings
sched: add some comments to the bandwidth code
sched: fixlet for group load balance
sched: rework wakeup preemption
CFS scheduler: documentation about scheduling policies
sched: clarify ifdef tangle
sched: fix list traversal to use _rcu variant
sched: turn off WAKEUP_OVERLAP
sched: wakeup preempt when small overlap
kernel/cpu.c: create a CPU_STARTING cpu_chain notifier
kernel/cpu.c: Move the CPU_DYING notifiers
sched: fix __load_balance_iterator() for cfq with only one task
...
* git://git.kernel.org/pub/scm/linux/kernel/git/herbert/crypto-2.6:
crypto: skcipher - Use RNG interface instead of get_random_bytes
crypto: rng - RNG interface and implementation
crypto: api - Add fips_enable flag
crypto: skcipher - Move IV generators into their own modules
crypto: cryptomgr - Test ciphers using ECB
crypto: api - Use test infrastructure
crypto: cryptomgr - Add test infrastructure
crypto: tcrypt - Add alg_test interface
crypto: tcrypt - Abort and only log if there is an error
crypto: crc32c - Use Intel CRC32 instruction
crypto: tcrypt - Avoid using contiguous pages
crypto: api - Display larval objects properly
crypto: api - Export crypto_alg_lookup instead of __crypto_alg_lookup
crypto: Kconfig - Replace leading spaces with tabs
Jeremy Fitzhardinge wrote:
> I'd noticed that current tip/master hasn't been booting under Xen, and I
> just got around to bisecting it down to this change.
>
> commit 065ae73c5462d42e9761afb76f2b52965ff45bd6
> Author: Suresh Siddha <suresh.b.siddha@intel.com>
>
> x86, cpa: make the kernel physical mapping initialization a two pass sequence
>
> This patch is causing Xen to fail various pagetable updates because it
> ends up remapping pagetables to RW, which Xen explicitly prohibits (as
> that would allow guests to make arbitrary changes to pagetables, rather
> than have them mediated by the hypervisor).
Instead of making init a two pass sequence, to satisfy the Intel's TLB
Application note (developer.intel.com/design/processor/applnots/317080.pdf
Section 6 page 26), we preserve the original page permissions
when fragmenting the large mappings and don't touch the existing memory
mapping (which satisfies Xen's requirements).
Only open issue is: on a native linux kernel, we will go back to mapping
the first 0-1GB kernel identity mapping as executable (because of the
static mapping setup in head_64.S). We can fix this in a different
patch if needed.
Signed-off-by: Suresh Siddha <suresh.b.siddha@intel.com>
Acked-by: Jeremy Fitzhardinge <jeremy@goop.org>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
Track the memtype for RAM pages in page struct instead of using the
memtype list. This avoids the explosion in the number of entries in
memtype list (of the order of 20,000 with AGP) and makes the PAT
tracking simpler.
We are using PG_arch_1 bit in page->flags.
We still use the memtype list for non RAM pages.
Signed-off-by: Suresh Siddha <suresh.b.siddha@intel.com>
Signed-off-by: Venkatesh Pallipadi <venkatesh.pallipadi@intel.com>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
Do a global flush tlb after splitting the large page and before we do the
actual change page attribute in the PTE.
With out this, we violate the TLB application note, which says
"The TLBs may contain both ordinary and large-page translations for
a 4-KByte range of linear addresses. This may occur if software
modifies the paging structures so that the page size used for the
address range changes. If the two translations differ with respect
to page frame or attributes (e.g., permissions), processor behavior
is undefined and may be implementation-specific."
And also serialize cpa() (for !DEBUG_PAGEALLOC which uses large identity
mappings) using cpa_lock. So that we don't allow any other cpu, with stale
large tlb entries change the page attribute in parallel to some other cpu
splitting a large page entry along with changing the attribute.
Signed-off-by: Suresh Siddha <suresh.b.siddha@intel.com>
Cc: Suresh Siddha <suresh.b.siddha@intel.com>
Cc: arjan@linux.intel.com
Cc: venkatesh.pallipadi@intel.com
Cc: jeremy@goop.org
Signed-off-by: Ingo Molnar <mingo@elte.hu>
Interrupt context no longer splits large page in cpa(). So we can do away
with cpa memory pool code.
Signed-off-by: Suresh Siddha <suresh.b.siddha@intel.com>
Cc: Suresh Siddha <suresh.b.siddha@intel.com>
Cc: arjan@linux.intel.com
Cc: venkatesh.pallipadi@intel.com
Cc: jeremy@goop.org
Signed-off-by: Ingo Molnar <mingo@elte.hu>
No alias checking needed for setting present/not-present mapping. Otherwise,
we may need to break large pages for 64-bit kernel text mappings (this adds to
complexity if we want to do this from atomic context especially, for ex:
with CONFIG_DEBUG_PAGEALLOC). Let's keep it simple!
Signed-off-by: Suresh Siddha <suresh.b.siddha@intel.com>
Cc: Suresh Siddha <suresh.b.siddha@intel.com>
Cc: arjan@linux.intel.com
Cc: venkatesh.pallipadi@intel.com
Cc: jeremy@goop.org
Signed-off-by: Ingo Molnar <mingo@elte.hu>
Don't use large pages for kernel identity mapping with DEBUG_PAGEALLOC.
This will remove the need to split the large page for the
allocated kernel page in the interrupt context.
This will simplify cpa code(as we don't do the split any more from the
interrupt context). cpa code simplication in the subsequent patches.
Signed-off-by: Suresh Siddha <suresh.b.siddha@intel.com>
Cc: Suresh Siddha <suresh.b.siddha@intel.com>
Cc: arjan@linux.intel.com
Cc: venkatesh.pallipadi@intel.com
Cc: jeremy@goop.org
Signed-off-by: Ingo Molnar <mingo@elte.hu>
In the first pass, kernel physical mapping will be setup using large or
small pages but uses the same PTE attributes as that of the early
PTE attributes setup by early boot code in head_[32|64].S
After flushing TLB's, we go through the second pass, which setups the
direct mapped PTE's with the appropriate attributes (like NX, GLOBAL etc)
which are runtime detectable.
This two pass mechanism conforms to the TLB app note which says:
"Software should not write to a paging-structure entry in a way that would
change, for any linear address, both the page size and either the page frame
or attributes."
Signed-off-by: Suresh Siddha <suresh.b.siddha@intel.com>
Cc: Suresh Siddha <suresh.b.siddha@intel.com>
Cc: arjan@linux.intel.com
Cc: venkatesh.pallipadi@intel.com
Cc: jeremy@goop.org
Signed-off-by: Ingo Molnar <mingo@elte.hu>
This merges phase 1 of the x86 tree, which is a collection of branches:
x86/alternatives, x86/cleanups, x86/commandline, x86/crashdump,
x86/debug, x86/defconfig, x86/doc, x86/exports, x86/fpu, x86/gart,
x86/idle, x86/mm, x86/mtrr, x86/nmi-watchdog, x86/oprofile,
x86/paravirt, x86/reboot, x86/sparse-fixes, x86/tsc, x86/urgent and
x86/vmalloc
and as Ingo says: "these are the easiest, purely independent x86 topics
with no conflicts, in one nice Octopus merge".
* 'x86-v28-for-linus-phase1' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip: (147 commits)
x86: mtrr_cleanup: treat WRPROT as UNCACHEABLE
x86: mtrr_cleanup: first 1M may be covered in var mtrrs
x86: mtrr_cleanup: print out correct type v2
x86: trivial printk fix in efi.c
x86, debug: mtrr_cleanup print out var mtrr before change it
x86: mtrr_cleanup try gran_size to less than 1M, v3
x86: mtrr_cleanup try gran_size to less than 1M, cleanup
x86: change MTRR_SANITIZER to def_bool y
x86, debug printouts: IOMMU setup failures should not be KERN_ERR
x86: export set_memory_ro and set_memory_rw
x86: mtrr_cleanup try gran_size to less than 1M
x86: mtrr_cleanup prepare to make gran_size to less 1M
x86: mtrr_cleanup safe to get more spare regs now
x86_64: be less annoying on boot, v2
x86: mtrr_cleanup hole size should be less than half of chunk_size, v2
x86: add mtrr_cleanup_debug command line
x86: mtrr_cleanup optimization, v2
x86: don't need to go to chunksize to 4G
x86_64: be less annoying on boot
x86, olpc: fix endian bug in openfirmware workaround
...
Write the name of the unknown vendor_id to output instead of just
"unknown".
Tag changed to 'vendor_id' as used in /proc/cpuinfo
Signed-off-by: Hans Schou <linux@schou.dk>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
When reserving space for the hypervisor the Xen paravirt backend adds
an extra two pages (this was carried forward from the 2.6.18-xen tree
which had them "for safety"). Depending on various CONFIG options this
can cause the boot time fixmaps to span multiple PMDs which is not
supported and triggers a WARN in early_ioremap_init().
This was exposed by 2216d199b1 which
moved the dmi table parsing earlier.
x86: fix CONFIG_X86_RESERVE_LOW_64K=y
The bad_bios_dmi_table() quirk never triggered because we do DMI setup
too late. Move it a bit earlier.
There is no real reason to reserve these two extra pages and the
fixmap already incorporates FIX_HOLE which serves the same
purpose. None of the other callers of reserve_top_address do this.
Signed-off-by: Ian Campbell <ian.campbell@citrix.com>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
Add a cpu parameter to __cpufreq_driver_getavg(). This is needed for software
cpufreq coordination where policy->cpu may not be same as the CPU on which we
want to getavg frequency.
A follow-on patch will use this parameter to getavg freq from all cpus
in policy->cpus.
Change since last patch. Fix the offline/online and suspend/resume
oops reported by Youquan Song <youquan.song@intel.com>
Signed-off-by: Venkatesh Pallipadi <venkatesh.pallipadi@intel.com>
Signed-off-by: Dave Jones <davej@redhat.com>
add error handling for cpufreq_register_driver() error
Signed-off-by: Akinobu Mita <akinobu.mita@gmail.com>
Cc: cpufreq@lists.linux.org.uk
Signed-off-by: Dave Jones <davej@redhat.com>
Replace the no longer working links and email address in the
documentation and in source code.
Signed-off-by: Márton Németh <nm127@freemail.hu>
Signed-off-by: Dave Jones <davej@redhat.com>
When pinning/unpinning a pagetable with split pte locks, we can end up
holding multiple pte locks at once (we need to hold the locks while
there's a pending batched hypercall affecting the pte page). Because
all the pte locks are in the same lock class, lockdep thinks that
we're potentially taking a lock recursively.
This warning is spurious because we always take the pte locks while
holding mm->page_table_lock. lockdep now has spin_lock_nest_lock to
express this kind of dominant lock use, so use it here so that lockdep
knows what's going on.
Signed-off-by: Jeremy Fitzhardinge <jeremy.fitzhardinge@citrix.com>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
If a processor implementation discern that a processor state component is in
its initialized state, it may modify the corresponding bit in the
xsave header.xstate_bv as '0'. State in the memory layout setup by 'xsave'
will be consistent with the bit values in the header.
During signal handling, legacy applications may change the FP/SSE bits
in the sigcontext memory layout without touching the FP/SSE header bits
in the xsave header. So always set FP/SSE bits in the xsave header
while saving the sigcontext state to the user space. During signal return,
this will enable the kernel to capture any changes to the FP/SSE bits by the
legacy applications which don't touch xsave headers.
xsave aware apps can change the xstate_bv in the xsave header aswell
as change any contents in the memory layout. xrestor as part of sigreturn
will capture all the changes.
Signed-off-by: Suresh Siddha <suresh.b.siddha@intel.com>
Signed-off-by: H. Peter Anvin <hpa@zytor.com>
This PCI ID based quick should be a full solution for the IRQ0 override
related slowdown problem on SB450 based systems:
33fb0e4: x86: SB450: skip IRQ0 override if it is not routed to INT2 of IOAPIC
Emit a warning in those cases where the DMI quirk triggers but
the PCI ID based quirk didnt.
If this warning does not trigger then we can phase out the DMI quirks.
Signed-off-by: Ingo Molnar <mingo@elte.hu>
On some HP nx6... laptops (e.g. nx6325) BIOS reports an IRQ0 override
but the SB450 chipset is configured such that timer interrupts goe to
INT0 of IOAPIC.
Check IRQ0 routing and if it is routed to INT0 of IOAPIC skip the
timer override.
[ This more generic PCI ID based quirk should alleviate the need for
dmi_ignore_irq0_timer_override DMI quirks. ]
Signed-off-by: Andreas Herrmann <andreas.herrmann3@amd.com>
Acked-by: "Maciej W. Rozycki" <macro@linux-mips.org>
Tested-by: Dmitry Torokhov <dtor@mail.ru>
Cc: <stable@kernel.org>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
* 'x86-fixes-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip:
x86: gart iommu have direct mapping when agp is present too
Stress-testing KVM's latest NMI support with kgdbts inside an SMP guest,
I came across spurious unhandled NMIs while running the singlestep test.
Looking closer at the code path each NMI takes when KGDB is enabled, I
noticed that kgdb_nmicallback is called twice per event: One time via
DIE_NMI_IPI notification, the second time on DIE_NMI. Removing the first
invocation cures the unhandled NMIs here.
Signed-off-by: Jan Kiszka <jan.kiszka@siemens.com>
Signed-off-by: Jason Wessel <jason.wessel@windriver.com>
There is a bug in the BIOSes of some HP boxes with AMD Turions which
connects IO-APIC pins with ACPI thermal trip points in such a way that
if the state of the IO-APIC is not as expected by the (buggy) BIOS, the
thermal trip points are set to insanely low values (usually all of them
become 16 degrees Celsius). As a result, thermal throttling kicks in
and knock the system down to its shoes.
Unfortunately some of the recent IO-APIC changes made the bug show up.
To prevent this from happening, blacklist machines that are known to be
affected (nx6115 and 6715b in this particular case).
This fixes http://bugzilla.kernel.org/show_bug.cgi?id=11516 listed as
a regression from 2.6.26.
On my box it was caused by:
commit 691874fa96
Author: Maciej W. Rozycki <macro@linux-mips.org>
Date: Tue May 27 21:19:51 2008 +0100
x86: I/O APIC: timer through 8259A second-chance
Signed-off-by: Maciej W. Rozycki <macro@linux-mips.org>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
and the whole story is described in this (huge) thread:
http://marc.info/?l=linux-kernel&m=121358440508410&w=4
Matthew Garrett told us about that happening on the nx6125:
http://marc.info/?l=linux-kernel&m=121396307411930&w=4
and then Maciej analysed the breakage on the basis of a DSDT from the
nx6325:
http://marc.info/?l=linux-kernel&m=121401068718826&w=4
As far as the Dmitry's and Jason's boxes are concerned, I recognized the
symptoms and asked them to verify that the blacklisting helped.
It appears that the buggy BIOS code has been copy-pasted to the entire
range of machines, for no good reason.
Signed-off-by: Rafael J. Wysocki <rjw@sisk.pl>
Tested-by: Dmitry Torokhov <dmitry.torokhov@gmail.com>
Tested-by: Jason Vas Dias <jason.vas.dias@gmail.com>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>