kernel_optimize_test/arch/blackfin/Kconfig.debug
Robin Getz b9a3899d59 Blackfin: make deferred hardware errors more exact
Hardware errors on the Blackfin architecture are queued by nature of the
hardware design.  Things that could generate a hardware level queue up at
the system interface and might not process until much later, at which
point the system would send a notification back to the core.

As such, it is possible for user space code to do something that would
trigger a hardware error, but have it delay long enough for the process
context to switch.  So when the hardware error does signal, we mistakenly
evaluate it as a different process or as kernel context and panic (erp!).
This makes it pretty difficult to find the offending context.  But wait,
there is good news somewhere.

By forcing a SSYNC in the interrupt entry, we force all pending queues at
the system level to be processed and all hardware errors to be signaled.
Then we check the current interrupt state to see if the hardware error is
now signaled.  If so, we re-queue the current interrupt and return thus
allowing the higher priority hardware error interrupt to process properly.
Since we haven't done any other context processing yet, the right context
will be selected and killed.  There is still the possibility that the
exact offending instruction will be unknown, but at least we'll have a
much better idea of where to look.

The downside of course is that this causes system-wide syncs at every
interrupt point which results in significant performance degradation.
Since this situation should not occur in any properly configured system
(as hardware errors are triggered by things like bad pointers), make it a
debug configuration option and disable it by default.

Signed-off-by: Robin Getz <robin.getz@analog.com>
Signed-off-by: Mike Frysinger <vapier@gentoo.org>
2009-06-12 06:11:44 -04:00

256 lines
8.7 KiB
Plaintext
Raw Blame History

This file contains ambiguous Unicode characters

This file contains Unicode characters that might be confused with other characters. If you think that this is intentional, you can safely ignore this warning. Use the Escape button to reveal them.

menu "Kernel hacking"
source "lib/Kconfig.debug"
config DEBUG_STACKOVERFLOW
bool "Check for stack overflows"
depends on DEBUG_KERNEL
help
This option will cause messages to be printed if free stack space
drops below a certain limit.
config DEBUG_STACK_USAGE
bool "Enable stack utilization instrumentation"
depends on DEBUG_KERNEL
help
Enables the display of the minimum amount of free stack which each
task has ever had available in the sysrq-T output.
This option will slow down process creation somewhat.
config HAVE_ARCH_KGDB
def_bool y
config DEBUG_VERBOSE
bool "Verbose fault messages"
default y
select PRINTK
help
When a program crashes due to an exception, or the kernel detects
an internal error, the kernel can print a not so brief message
explaining what the problem was. This debugging information is
useful to developers and kernel hackers when tracking down problems,
but mostly meaningless to other people. This is always helpful for
debugging but serves no purpose on a production system.
Most people should say N here.
config DEBUG_MMRS
bool "Generate Blackfin MMR tree"
select DEBUG_FS
help
Create a tree of Blackfin MMRs via the debugfs tree. If
you enable this, you will find all MMRs laid out in the
/sys/kernel/debug/blackfin/ directory where you can read/write
MMRs directly from userspace. This is obviously just a debug
feature.
config DEBUG_HWERR
bool "Hardware error interrupt debugging"
depends on DEBUG_KERNEL
help
When enabled, the hardware error interrupt is never disabled, and
will happen immediately when an error condition occurs. This comes
at a slight cost in code size, but is necessary if you are getting
hardware error interrupts and need to know where they are coming
from.
config EXACT_HWERR
bool "Try to make Hardware errors exact"
depends on DEBUG_HWERR
help
By default, the Blackfin hardware errors are not exact - the error
be reported multiple cycles after the error happens. This delay
can cause the wrong application, or even the kernel to receive a
signal to be killed. If you are getting HW errors in your system,
try turning this on to ensure they are at least comming from the
proper thread.
On production systems, it is safe (and a small optimization) to say N.
config DEBUG_DOUBLEFAULT
bool "Debug Double Faults"
default n
help
If an exception is caused while executing code within the exception
handler, the NMI handler, the reset vector, or in emulator mode,
a double fault occurs. On the Blackfin, this is a unrecoverable
event. You have two options:
- RESET exactly when double fault occurs. The excepting
instruction address is stored in RETX, where the next kernel
boot will print it out.
- Print debug message. This is much more error prone, although
easier to handle. It is error prone since:
- The excepting instruction is not committed.
- All writebacks from the instruction are prevented.
- The generated exception is not taken.
- The EXCAUSE field is updated with an unrecoverable event
The only way to check this is to see if EXCAUSE contains the
unrecoverable event value at every exception return. By selecting
this option, you are skipping over the faulting instruction, and
hoping things stay together enough to print out a debug message.
This does add a little kernel code, but is the only method to debug
double faults - if unsure say "Y"
choice
prompt "Double Fault Failure Method"
default DEBUG_DOUBLEFAULT_PRINT
depends on DEBUG_DOUBLEFAULT
config DEBUG_DOUBLEFAULT_PRINT
bool "Print"
config DEBUG_DOUBLEFAULT_RESET
bool "Reset"
endchoice
config DEBUG_ICACHE_CHECK
bool "Check Instruction cache coherency"
depends on DEBUG_KERNEL
depends on DEBUG_HWERR
help
Say Y here if you are getting weird unexplained errors. This will
ensure that icache is what SDRAM says it should be by doing a
byte wise comparison between SDRAM and instruction cache. This
also relocates the irq_panic() function to L1 memory, (which is
un-cached).
config DEBUG_HUNT_FOR_ZERO
bool "Catch NULL pointer reads/writes"
default y
help
Say Y here to catch reads/writes to anywhere in the memory range
from 0x0000 - 0x0FFF (the first 4k) of memory. This is useful in
catching common programming errors such as NULL pointer dereferences.
Misbehaving applications will be killed (generate a SEGV) while the
kernel will trigger a panic.
Enabling this option will take up an extra entry in CPLB table.
Otherwise, there is no extra overhead.
config DEBUG_BFIN_HWTRACE_ON
bool "Turn on Blackfin's Hardware Trace"
default y
help
All Blackfins include a Trace Unit which stores a history of the last
16 changes in program flow taken by the program sequencer. The history
allows the user to recreate the program sequencers recent path. This
can be handy when an application dies - we print out the execution
path of how it got to the offending instruction.
By turning this off, you may save a tiny amount of power.
choice
prompt "Omit loop Tracing"
default DEBUG_BFIN_HWTRACE_COMPRESSION_OFF
depends on DEBUG_BFIN_HWTRACE_ON
help
The trace buffer can be configured to omit recording of changes in
program flow that match either the last entry or one of the last
two entries. Omitting one of these entries from the record prevents
the trace buffer from overflowing because of any sort of loop (for, do
while, etc) in the program.
Because zero-overhead Hardware loops are not recorded in the trace buffer,
this feature can be used to prevent trace overflow from loops that
are nested four deep.
config DEBUG_BFIN_HWTRACE_COMPRESSION_OFF
bool "Trace all Loops"
help
The trace buffer records all changes of flow
config DEBUG_BFIN_HWTRACE_COMPRESSION_ONE
bool "Compress single-level loops"
help
The trace buffer does not record single loops - helpful if trace
is spinning on a while or do loop.
config DEBUG_BFIN_HWTRACE_COMPRESSION_TWO
bool "Compress two-level loops"
help
The trace buffer does not record loops two levels deep. Helpful if
the trace is spinning in a nested loop
endchoice
config DEBUG_BFIN_HWTRACE_COMPRESSION
int
depends on DEBUG_BFIN_HWTRACE_ON
default 0 if DEBUG_BFIN_HWTRACE_COMPRESSION_OFF
default 1 if DEBUG_BFIN_HWTRACE_COMPRESSION_ONE
default 2 if DEBUG_BFIN_HWTRACE_COMPRESSION_TWO
config DEBUG_BFIN_HWTRACE_EXPAND
bool "Expand Trace Buffer greater than 16 entries"
depends on DEBUG_BFIN_HWTRACE_ON
default n
help
By selecting this option, every time the 16 hardware entries in
the Blackfin's HW Trace buffer are full, the kernel will move them
into a software buffer, for dumping when there is an issue. This
has a great impact on performance, (an interrupt every 16 change of
flows) and should normally be turned off, except in those nasty
debugging sessions
config DEBUG_BFIN_HWTRACE_EXPAND_LEN
int "Size of Trace buffer (in power of 2k)"
range 0 4
depends on DEBUG_BFIN_HWTRACE_EXPAND
default 1
help
This sets the size of the software buffer that the trace information
is kept in.
0 for (2^0) 1k, or 256 entries,
1 for (2^1) 2k, or 512 entries,
2 for (2^2) 4k, or 1024 entries,
3 for (2^3) 8k, or 2048 entries,
4 for (2^4) 16k, or 4096 entries
config DEBUG_BFIN_NO_KERN_HWTRACE
bool "Turn off hwtrace in CPLB handlers"
depends on DEBUG_BFIN_HWTRACE_ON
default y
help
The CPLB error handler contains a lot of flow changes which can
quickly fill up the hardware trace buffer. When debugging crashes,
the hardware trace may indicate that the problem lies in kernel
space when in reality an application is buggy.
Say Y here to disable hardware tracing in some known "jumpy" pieces
of code so that the trace buffer will extend further back.
config EARLY_PRINTK
bool "Early printk"
default n
select SERIAL_CORE_CONSOLE
help
This option enables special console drivers which allow the kernel
to print messages very early in the bootup process.
This is useful for kernel debugging when your machine crashes very
early before the console code is initialized. After enabling this
feature, you must add "earlyprintk=serial,uart0,57600" to the
command line (bootargs). It is safe to say Y here in all cases, as
all of this lives in the init section and is thrown away after the
kernel boots completely.
config CPLB_INFO
bool "Display the CPLB information"
help
Display the CPLB information via /proc/cplbinfo.
config ACCESS_CHECK
bool "Check the user pointer address"
default y
help
Usually the pointer transfer from user space is checked to see if its
address is in the kernel space.
Say N here to disable that check to improve the performance.
endmenu