Commit Graph

12625 Commits

Author SHA1 Message Date
Michael Ellerman
a7de7c7422 [POWERPC] MPIC MSI allocator
To support MSI on MPIC we need a way to reserve and allocate hardware irq
numbers, this patch implements an allocator for that purpose.

New firmware platforms must define a "msi-available-ranges" property on their
MPIC node for MSI to work. For U3/U4 we do a best-guess setup.

Signed-off-by: Michael Ellerman <michael@ellerman.id.au>
Signed-off-by: Paul Mackerras <paulus@samba.org>
2007-05-08 13:43:48 +10:00
Michael Ellerman
812fd1fd63 [POWERPC] Enable MSI mappings for MPIC
On some Apple machines the HT MSI mappings are not enabled by firmware, so
we need to do it by hand.

We can't use the pci routines as this code runs too early.

Signed-off-by: Michael Ellerman <michael@ellerman.id.au>
Signed-off-by: Paul Mackerras <paulus@samba.org>
2007-05-08 13:42:27 +10:00
Michael Ellerman
014dad902a [POWERPC] Tell Phyp we support MSI
Tell Phyp we support MSI via the client architecture support mechanism.

Signed-off-by: Jake Moilanen <moilanen@austin.ibm.com>
Signed-off-by: Michael Ellerman <michael@ellerman.id.au>
Signed-off-by: Paul Mackerras <paulus@samba.org>
2007-05-08 13:40:31 +10:00
Michael Ellerman
85f2bf9f60 [POWERPC] RTAS MSI implementation
Implement MSI support via RTAS (RTAS = run-time firmware on pSeries
machines).  For now we assumes that if the required RTAS tokens for
MSI are present, then we want to use the RTAS MSI routines.

When RTAS is managing MSIs for us, it will/may enable MSI on devices that
support it by default. This is contrary to the Linux model where a device
is in LSI mode until the driver requests MSIs.

To remedy this we add a pci_irq_fixup call, which disables MSI if they've
been assigned by firmware and the device also supports LSI. Devices that
don't support LSI at all will be left as is, drivers are still expected
to call pci_enable_msi() before using the device.

At the moment there is no pci_irq_fixup on pSeries, so we can just set it
unconditionally. If other platforms use the RTAS MSI backend they'll need
to check that still holds.

Signed-off-by: Michael Ellerman <michael@ellerman.id.au>
Signed-off-by: Paul Mackerras <paulus@samba.org>
2007-05-08 13:40:31 +10:00
Michael Ellerman
df87ef5508 [POWERPC] PowerPC MSI infrastructure
This provides the architecture specific hooks to support MSI on
powerpc.  We implement the newly added arch_setup_msi_irqs() and
arch_teardown_msi_irqs(), and then delegate to ppc_md routines.

Platforms that don't implement MSI will leave the ppc_md calls blank,
arch_msi_check_device() will detect this and return ENOSYS. Drivers
should detect this error and continue to use LSI.

Signed-off-by: Michael Ellerman <michael@ellerman.id.au>
Signed-off-by: Paul Mackerras <paulus@samba.org>
2007-05-08 13:40:31 +10:00
Michael Ellerman
f728b5c3a5 [POWERPC] Rip out the existing powerpc msi stubs
Rip out the existing powerpc msi stubs. These were the start of an
implementation based on ppc_md calls, but were never used in mainline.

Signed-off-by: Michael Ellerman <michael@ellerman.id.au>
Signed-off-by: Paul Mackerras <paulus@samba.org>
2007-05-08 13:40:31 +10:00
David Gibson
d1953c8888 [POWERPC] Remove use of 4level-fixup.h for ppc32
For 32-bit systems, powerpc still relies on the 4level-fixup.h hack,
to pretend that the generic pagetable handling stuff is 3-levels
rather than 4.  This patch removes this, instead using the newer
pgtable-nopmd.h to handle the elision of both the pud and pmd
pagetable levels (ppc32 pagetables are actually 2 levels).

This removes a little extraneous code, and makes it more easily
compared to the 64-bit pagetable code.

Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
Signed-off-by: Paul Mackerras <paulus@samba.org>
2007-05-08 13:40:31 +10:00
Brian King
00c2ae35bd [POWERPC] Add powerpc PCI-E reset API implementation
Adds the pSeries platform implementation for a new PCI API
which can be used to issue various types of PCI-E reset,
including PCI-E warm reset and PCI-E hot reset. This is needed
for an ipr PCI-E adapter which does not properly implement BIST.
Running BIST on this adapter results in PCI-E errors. The only
reliable reset mechanism that exists on this hardware is PCI
Fundamental reset (warm reset).

Acked-by: Linas Vepstas <linas@austin.ibm.com>
Signed-off-by: Brian King <brking@linux.vnet.ibm.com>
Signed-off-by: Paul Mackerras <paulus@samba.org>
2007-05-08 13:40:31 +10:00
Paul Mackerras
02bbc0f09c Merge branch 'linux-2.6' 2007-05-08 13:37:51 +10:00
Josh Boyer
7487a2245b [POWERPC] Holly bootwrapper
Add Holly/Hickory bootwrapper

Signed-off-by: Stephen Winiecki <stevewin@us.ibm.com>
Signed-off-by: Josh Boyer <jwboyer@linux.vnet.ibm.com>
Signed-off-by: Paul Mackerras <paulus@samba.org>
2007-05-08 11:54:21 +10:00
Josh Boyer
d4d19ec493 [POWERPC] Holly DTS
Add Holly DTS file

Signed-off-by: Stephen Winiecki <stevewin@us.ibm.com>
Signed-off-by: Josh Boyer <jwboyer@linux.vnet.ibm.com>
Signed-off-by: Paul Mackerras <paulus@samba.org>
2007-05-08 11:54:20 +10:00
Josh Boyer
897fd81795 [POWERPC] Holly defconfig
Holly/Hickory defconfig

Signed-off-by: Stephen Winiecki <stevewin@us.ibm.com>
Signed-off-by: Josh Boyer <jwboyer@linux.vnet.ibm.com>
Signed-off-by: Paul Mackerras <paulus@samba.org>
2007-05-08 11:54:20 +10:00
Josh Boyer
cb9e4d10c4 [POWERPC] Add support for 750CL Holly board
Add PowerPC 750 Holly/Hickory platform support

Signed-off-by: Stephen Winiecki <stevewin@us.ibm.com>
Signed-off-by: Josh Boyer <jwboyer@linux.vnet.ibm.com>
Acked-by: Olof Johansson <olof@lixom.net>
Signed-off-by: Paul Mackerras <paulus@samba.org>
2007-05-08 11:54:20 +10:00
Josh Boyer
05ad6a9159 [POWERPC] Generalize tsi108 PCI setup
Generalize tsi108_setup_pci to take the config space physical address and
primary bus designator as a parameter.

Signed-off-by: Josh Boyer <jwboyer@linux.vnet.ibm.com>
Acked-by: Olof Johansson <olof@lixom.net>
Signed-off-by: Paul Mackerras <paulus@samba.org>
2007-05-08 11:54:20 +10:00
Josh Boyer
c1b78d05b3 [POWERPC] Generalize tsi108 PHY types
Add a phy_type field to the tsi108 ethernet structures to indicate which PHY
is used on a board.  This is derived from the "compatible" property in the
ethernet-phy node of the device tree.  The default remains the MV88E PHY.

Also, convert the setup code to use of_get_mac_address instead of hard coding
a lookup for the "address" property in the ethernet node.

Signed-off-by: Josh Boyer <jwboyer@linux.vnet.ibm.com>
Acked-by: Olof Johansson <olof@lixom.net>
Signed-off-by: Paul Mackerras <paulus@samba.org>
2007-05-08 11:54:20 +10:00
Josh Boyer
08390db07a [POWERPC] Add tsi108_pci.h for common PCI functions
Add a header file for the common PCI routines used for the TSI bridge

Signed-off-by: Josh Boyer <jwboyer@linux.vnet.ibm.com>
Acked-by: Olof Johansson <olof@lixom.net>
Signed-off-by: Paul Mackerras <paulus@samba.org>
2007-05-08 11:54:20 +10:00
Linas Vepstas
fb39a96e23 [POWERPC] Export pcibios_remove_pci_devices
The pseries PCI hotplug code cannot build as a module, unless
the pcibios_remove_pci_devices function is exported.

Signed-off-by: Linas Vepstas <linas@austin.ibm.com>
----
 arch/powerpc/platforms/pseries/pci_dlpar.c |    1 +
 1 file changed, 1 insertion(+)
Signed-off-by: Paul Mackerras <paulus@samba.org>
2007-05-08 11:54:19 +10:00
Michael Ellerman
0108d3fe3c [POWERPC] Add __init annotations to reserve_mem() and stabs_alloc()
reserve_mem() and stabs_alloc() are both called only from other __init
routines, so can be marked __init.

Signed-off-by: Michael Ellerman <michael@ellerman.id.au>
Signed-off-by: Paul Mackerras <paulus@samba.org>
2007-05-08 11:54:19 +10:00
Paul Mackerras
11fbb00c67 [POWERPC] Cope with PCI host bridge I/O window not starting at 0
Currently our code to set up the data structures for a PCI host bridge
and create the mapping for its I/O window assumes that the window
starts at I/O port 0 on the PCI side.  If this is not true, we can end
up with I/O port numbers in the resources for PCI devices which will
cause an oops if a driver tries to access them via inb/outb etc.,
because there is no mapping for the corresponding addresses.

Normally the I/O window starts at 0, but there are some situations on
partitioned machines with a hypervisor where the window may not start
at 0.

This fixes the problem by allocating space for the range from 0 to the
end of the I/O window.  That is, hose->io_base_virt contains the
virtual address for I/O port 0 on the PCI bus, and thus the assumption
that hose->io_base_virt - pci_io_base is the offset between the
"global" I/O port numbers (those in the PCI device resources) and the
I/O port numbers on the PCI bus is maintained.

For PCI host bridges that are present at boot, we only map the portion
of that range that correspond to the bridge's I/O window.  For bridges
added after boot we ioremap the range from 0 to the end of the I/O
window, for now; in fact hot-added bridges should be using
reserve_phb_iospace() and __ioremap_explicit (so they get sensible
global port numbers), but we don't have the infrastructure yet to do
that (basically a free_phb_iospace() routine plus appropriate
locking).

Interestingly, this makes the two arms of the if statement in
get_bus_io_range do almost exactly the same thing; that function could
now be simplified in a further patch.

Signed-off-by: Paul Mackerras <paulus@samba.org>
2007-05-08 11:54:19 +10:00
Linus Torvalds
a989705c4c Merge git://git.kernel.org/pub/scm/linux/kernel/git/aegl/linux-2.6
* git://git.kernel.org/pub/scm/linux/kernel/git/aegl/linux-2.6:
  [IA64] update memory attribute aliasing documentation & test cases
  [IA64] fail mmaps that span areas with incompatible attributes
  [IA64] allow WB /sys/.../legacy_mem mmaps
  [IA64] make ioremap avoid unsupported attributes
  [IA64] rename ioremap variables to match i386
  [IA64] relax per-cpu TLB requirement to DTC
  [IA64] remove per-cpu ia64_phys_stacked_size_p8
  [IA64] Fix example error injection program
  [IA64] Itanium MC Error Injection Tool: pal_mc_error_inject() interface
  [IA64] Itanium MC Error Injection Tool: Makefile changes
  [IA64] Itanium MC Error Injection Tool: Driver sysfs interface
  [IA64] Itanium MC Error Injection Tool: Doc and sample application
  [IA64] Itanium MC Error Injection Tool: Kernel configuration
2007-05-07 12:34:57 -07:00
Linus Torvalds
ef93127e4c Merge master.kernel.org:/pub/scm/linux/kernel/git/davem/sparc-2.6
* master.kernel.org:/pub/scm/linux/kernel/git/davem/sparc-2.6:
  [SERIAL] sunsu: Fix section mismatch warnings.
  [SPARC64]: pgtable_cache_init() should be __init.
  [SPARC64]: Fix section mismatch warnings in arch/sparc64/kernel/prom.c
  [SPARC64]: Fix section mismatch warnings in arch/sparc64/kernel/pci.c
  [SPARC64]: Fix section mismatch warnings in arch/sparc64/kernel/console.c
  [MM]: sparse_init() should be __init.
  [SPARC64]: Update defconfig.
  [VIDEO]: Add Sun XVR-2500 framebuffer driver.
  [VIDEO]: Add Sun XVR-500 framebuffer driver.
  [SPARC64]: SUN4U PCI-E controller support.
  [SPARC]: Fix comment typo in smp4m_blackbox_current().
  [SCSI] SUNESP: sun_esp.c needs linux/delay.h

Fix up conflict in arch/sparc64/mm/init.c manually due to removal of
pgtable_cache_init() through the -mm patches (even though that patch was
also by David ;)

Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2007-05-07 12:22:48 -07:00
Linus Torvalds
5b6b549822 Merge master.kernel.org:/pub/scm/linux/kernel/git/lethal/sh-2.6
* master.kernel.org:/pub/scm/linux/kernel/git/lethal/sh-2.6: (38 commits)
  sh: R7785RP board updates.
  sh: Update r7780rp defconfig.
  sh: Add die chain notifiers.
  sh: Fix APM emulation on hp6xx.
  sh: Wire up more IRQs for SH7709.
  sh: Solution Engine 7722 board support.
  sh: Fix r7780rp build.
  sh: kdump support.
  sh: Move clock reporting to its own proc entry.
  sh: Solution Engine SH7705 board and CPU updates.
  serial: sh-sci: Fix module clock refcount for serial console.
  serial: sh-sci: Fix module clock refcounting.
  sh: SH7722 clock framework support.
  sh: hp6xx pata_platform support.
  sh: Obey CONFIG_HZ for HZ definition.
  sh: Fix fstatat64() syscall.
  sh: se7780 PCI support.
  sh: SH7780 Solution Engine board support.
  sh: Add a dummy SH-4 PCIC fixup.
  sh: Tidy up L-BOX area5 addresses.
  ...
2007-05-07 12:17:40 -07:00
Jean Delvare
0ddb16cfb0 xtensa: strlcpy is smart enough
strlcpy already accounts for the trailing zero in its length
computation, so there is no need to substract one to the buffer size.

Signed-off-by: Jean Delvare <khali@linux-fr.org>
Cc: Chris Zankel <chris@zankel.net>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2007-05-07 12:13:04 -07:00
john stultz
ee17b36fd0 v850: generic timekeeping conversion
Convert an arch that does not currently implement sub-jiffy timekeeping to
use the generic timekeeping code.

v850 looks like it has some intent to implement sub-jiffy timekeeping, so
it may not yet be appropriate to try to convert, but I figured I'd get the
maintainer's input and submit the patch for comment.

Signed-off-by: John Stultz <johnstul@us.ibm.com>
Cc: Miles Bader <uclinux-v850@lsi.nec.co.jp>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2007-05-07 12:13:04 -07:00
Paolo 'Blaisorblade' Giarrusso
c2f239d93e uml: fix prototypes
Declare strlcpy and strlcat more correctly.

Signed-off-by: Paolo 'Blaisorblade' Giarrusso <blaisorblade@yahoo.it>
Cc: Jeff Dike <jdike@addtoit.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2007-05-07 12:13:04 -07:00
Jeff Dike
b7ec15bd00 uml: virtualized time fix
With the current timekeeping, !CONFIG_UML_REAL_TIME_CLOCK has
inconsistent behavior.  Previously, gettimeofday could be (and was)
isolated from the clock ticking.  Now, it's not, so when
CONFIG_UML_REAL_TIME_CLOCK is disabled, gettimeofday must progress in
lockstep with the clock, making it fully virtual.

Signed-off-by: Jeff Dike <jdike@linux.intel.com>
Cc: Paolo 'Blaisorblade' Giarrusso <blaisorblade@yahoo.it>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2007-05-07 12:13:04 -07:00
Jeff Dike
83ff7df5f1 uml: out of tmpfs space error clarification
It turns out that the message complaining about a lack of tmpfs space
on the host can be misunderstood as referring to the UML.

Signed-off-by: Jeff Dike <jdike@linux.intel.com>
Cc: Paolo 'Blaisorblade' Giarrusso <blaisorblade@yahoo.it>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2007-05-07 12:13:04 -07:00
Jeff Dike
1e7371c1a1 uml: only flush areas covered by VMA
When doing a full address space flush, only look at areas covered by a VMA.

Signed-off-by: Jeff Dike <jdike@linux.intel.com>
Cc: Paolo 'Blaisorblade' Giarrusso <blaisorblade@yahoo.it>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2007-05-07 12:13:04 -07:00
Jeff Dike
16dd07bc64 uml: more page fault path trimming
More trimming of the page fault path.

Permissions are passed around in a single int rather than one bit per
int.  The permission values are copied from libc so that they can be
passed to mmap and mprotect without any further conversion.

The register sets used by do_syscall_stub and copy_context_skas0 are
initialized once, at boot time, rather than once per call.

wait_stub_done checks whether it is getting the signals it expects by
comparing the wait status to a mask containing bits for the signals of
interest rather than comparing individually to the signal numbers.  It
also has one check for a wait failure instead of two.  The caller is
expected to do the initial continue of the stub.  This gets rid of an
argument and some logic.  The fname argument is gone, as that can be
had from a stack trace.

user_signal() is collapsed into userspace() as it is basically one or
two lines of code afterwards.

The physical memory remapping stuff is gone, as it is unused.

flush_tlb_page is inlined.

Signed-off-by: Jeff Dike <jdike@linux.intel.com>
Cc: Paolo 'Blaisorblade' Giarrusso <blaisorblade@yahoo.it>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2007-05-07 12:13:04 -07:00
Jeff Dike
3ec704e666 uml: eliminate a piece of debugging code
I missed removing another piece of debugging in an earlier patch.

Signed-off-by: Jeff Dike <jdike@linux.intel.com>
Cc: Paolo 'Blaisorblade' Giarrusso <blaisorblade@yahoo.it>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2007-05-07 12:13:04 -07:00
Jeff Dike
64f60841c0 uml: speed page fault path
Give the page fault code a specialized path.  There is only one page to look
at, so there's no point in going into the general page table walking code.
There's only going to be one host operation, so there are no opportunities for
merging.  So, we go straight to the pte we want, figure out what needs doing,
and do it.

While I was in here, I fixed the wart where the address passed to unmap was a
void *, but an unsigned long to map and protect.

This gives me just under 10% on a kernel build.

Signed-off-by: Jeff Dike <jdike@linux.intel.com>
Cc: Paolo 'Blaisorblade' Giarrusso <blaisorblade@yahoo.it>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2007-05-07 12:13:04 -07:00
Jeff Dike
8603ec8148 uml: aIO deadlock avoidance
Allow deadlocks to be avoided in the AIO code by setting the pipe to the I/O
thread non-blocking.

Signed-off-by: Jeff Dike <jdike@linux.intel.com>
Cc: Paolo 'Blaisorblade' Giarrusso <blaisorblade@yahoo.it>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2007-05-07 12:13:03 -07:00
Jeff Dike
a6ea4cceed uml: rename os_{read_write}_file_k back to os_{read_write}_file
Rename os_{read_write}_file_k back to os_{read_write}_file, delete
the originals and their bogus infrastructure, and fix all the callers.

Signed-off-by: Jeff Dike <jdike@linux.intel.com>
Cc: Paolo 'Blaisorblade' Giarrusso <blaisorblade@yahoo.it>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2007-05-07 12:13:03 -07:00
Jeff Dike
a263672424 uml: remove debugging remnants
I accidentally left the remnants of some debugging in an earlier patch.

Signed-off-by: Jeff Dike <jdike@linux.intel.com>
Cc: Paolo 'Blaisorblade' Giarrusso <blaisorblade@yahoo.it>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2007-05-07 12:13:03 -07:00
Jeff Dike
dc764e5087 uml: formatting fixes around os_{read_write}_file callers
Formatting fixes ahead of renaming os_{read_write}_file_k to
os_{read_write}_file and fixing all the callers.

Signed-off-by: Jeff Dike <jdike@linux.intel.com>
Cc: Paolo 'Blaisorblade' Giarrusso <blaisorblade@yahoo.it>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2007-05-07 12:13:03 -07:00
Jeff Dike
fda83a99b2 uml: change remaining callers of os_{read_write}_file
Convert all remaining os_{read_write}_file users to use the simple
{read,write} wrappers, os_{read_write}_file_k.

Signed-off-by: Jeff Dike <jdike@linux.intel.com>
Cc: Paolo 'Blaisorblade' Giarrusso <blaisorblade@yahoo.it>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2007-05-07 12:13:03 -07:00
Jeff Dike
77f6af778d uml: don't try to handle signals on initial process stack
Code running on the initial UML stack can't receive or process signals since
current must be valid when IRQs are handled, and there is no current for this
stack.

So, instead of using UML_LONGJMP and UML_SETJMP, which are careful to save and
restore signal state, and, as a side-effect, handle any deferred signals,
start_idle_thread must use the bare equivalents, which don't do anything with
signals.

Signed-off-by: Jeff Dike <jdike@linux.intel.com>
Cc: Paolo 'Blaisorblade' Giarrusso <blaisorblade@yahoo.it>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2007-05-07 12:13:03 -07:00
Jeff Dike
63843c265f uml: dump core on panic
Dump core after a panic.  This will provide better debugging information than
is currently available.

Signed-off-by: Jeff Dike <jdike@linux.intel.com>
Cc: Paolo 'Blaisorblade' Giarrusso <blaisorblade@yahoo.it>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2007-05-07 12:13:03 -07:00
Peter Zijlstra
990c55871b uml: fixup allocation in the ubd driver
Sanitise gfp flags; it actually is an atomic context, so drop the
GFP_KERNEL part.

Signed-off-by: Peter Zijlstra <a.p.zijlstra@chello.nl>
Acked-by: Jeff Dike <jdike@addtoit.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2007-05-07 12:13:03 -07:00
Jeff Dike
2adcec2197 uml: send pointers instead of structures to I/O thread
Instead of writing entire structures between UML and the I/O thread, we send
pointers.  This cuts down on the amount of data being copied and possibly
allows more requests to be pending between the two.

This requires that the requests be kmalloced and freed instead of living on
the stack.

Signed-off-by: Jeff Dike <jdike@linux.intel.com>
Cc: Paolo 'Blaisorblade' Giarrusso <blaisorblade@yahoo.it>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2007-05-07 12:13:03 -07:00
Jeff Dike
a0044bdf60 uml: batch I/O requests
Send as many I/O requests to the I/O thread as possible, even though it will
still only handle one at a time.  This provides an opportunity to reduce
latency by starting one request before the previous one has been finished in
the driver.

Request handling is somewhat modernized by requesting sg pieces of a request
and handling them separately, finishing off the entire request after all the
pieces are done.

When a request queue stalls, normally because its pipe to the I/O thread is
full, it is put on the restart list.  This list is processed by starting up
the queues on it whenever there is some indication that progress might be
possible again.  Currently, this happens in the driver interrupt routine.
Some requests have been finished, so there is likely to be room in the pipe
again.

This almost doubles throughput when copying data between devices, but made no
noticable difference on anything else I tried.

Signed-off-by: Jeff Dike <jdike@linux.intel.com>
Cc: Paolo 'Blaisorblade' Giarrusso <blaisorblade@yahoo.it>
Cc: Jens Axboe <jens.axboe@oracle.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2007-05-07 12:13:03 -07:00
Jeff Dike
a61f334fd2 uml: convert libc layer to call read and write
This patch converts calls in the os layer to os_{read,write}_file to calls
directly to libc read() and write() where it is clear that the I/O buffer is
in the kernel.

We can do that here instead of calling os_{read,write}_file_k since we are in
libc code and can call libc directly.

With the change in the calls, error handling needs to be changed to refer to
errno directly rather than the return value of the call.

CATCH_EINTR wrappers were also added where needed.

Signed-off-by: Jeff Dike <jdike@linux.intel.com>
Cc: Paolo 'Blaisorblade' Giarrusso <blaisorblade@yahoo.it>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2007-05-07 12:13:03 -07:00
Jeff Dike
ef0470c053 uml: tidy libc code
This patch lays some groundwork for the next one, which converts calls to
os_{read,write}_file into {read,write}, by doing some tidying in the affected
areas.

do_not_aio gets restructured to make the final result a bit cleaner.

There are also whitespace and other formatting fixes, fixes in error messages,
and a typo fix.

Signed-off-by: Jeff Dike <jdike@linux.intel.com>
Cc: Paolo 'Blaisorblade' Giarrusso <blaisorblade@yahoo.it>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2007-05-07 12:13:03 -07:00
Jeff Dike
3d564047a5 uml: start fixing os_read_file and os_write_file
This patch starts the removal of a very old, very broken piece of code.  This
stems from the problem of passing a userspace buffer into read() or write() on
the host.  If that buffer had not yet been faulted in, read and write will
return -EFAULT.

To avoid this problem, the solution was to fault the buffer in before the
system call by touching the pages that hold the buffer by doing a copy-user of
a byte to each page.  This is obviously bogus, but it does usually work, in tt
mode, since the kernel and process are in the same address space and userspace
addresses can be accessed directly in the kernel.

In skas mode, where the kernel and process are in separate address spaces, it
is completely bogus because the userspace address, which is invalid in the
kernel, is passed into the system call instead of the corresponding physical
address, which would be valid.  Here, it appears that this code, on every host
read() or write(), tries to fault in a random process page.  This doesn't seem
to cause any correctness problems, but there is a performance impact.  This
patch, and the ones following, result in a 10-15% performance gain on a kernel
build.

This code can't be immediately tossed out because when it is, you can't log
in.  Apparently, there is some code in the console driver which depends on
this somehow.

However, we can start removing it by switching the code which does I/O using
kernel addresses to using plain read() and write().  This patch introduces
os_read_file_k and os_write_file_k for use with kernel buffers and converts
all call locations which use obvious kernel buffers to use them.  These
include I/O using buffers which are local variables which are on the stack or
kmalloc-ed.  Later patches will handle the less obvious cases, followed by a
mass conversion back to the original interface.

Signed-off-by: Jeff Dike <jdike@linux.intel.com>
Cc: Paolo 'Blaisorblade' Giarrusso <blaisorblade@yahoo.it>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2007-05-07 12:13:03 -07:00
Jeff Dike
f9d6e5f83b uml: remove unused x86_64 code
It turns out that essentially none of the x86_64 bugs.c is needed.  So, we can
delete most of it.

Signed-off-by: Jeff Dike <jdike@linux.intel.com>
Cc: Paolo 'Blaisorblade' Giarrusso <blaisorblade@yahoo.it>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2007-05-07 12:13:03 -07:00
Jeff Dike
7f0536f80c uml: speed up page table walking
The previous page table walking code was horribly inefficient.  This patch
replaces it with code taken from elsewhere in the kernel.

Forking from bash is now ~5% faster and page faults are handled ~10% faster.

Signed-off-by: Jeff Dike <jdike@linux.intel.com>
Cc: Paolo 'Blaisorblade' Giarrusso <blaisorblade@yahoo.it>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2007-05-07 12:13:03 -07:00
Jeff Dike
f30c2c983e uml: dump registers on ptrace or wait failure
Provide a register dump if handle_trap fails.  Abstract out ptrace_dump_regs
since it now has two callers.

Signed-off-by: Jeff Dike <jdike@linux.intel.com>
Cc: Paolo 'Blaisorblade' Giarrusso <blaisorblade@yahoo.it>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2007-05-07 12:13:02 -07:00
Jeff Dike
2e3f5251ac uml: drivers get release methods
Define release methods for the ubd and net drivers.  They contain as much of
the remove methods as make sense.  All error checking must have already been
done as well as anything else that might be holding a reference on the device
kobject.

Signed-off-by: Jeff Dike <jdike@linux.intel.com>
Cc: Paolo 'Blaisorblade' Giarrusso <blaisorblade@yahoo.it>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2007-05-07 12:13:02 -07:00
Jeff Dike
d8839354a0 uml: delete HOST_FRAME_SIZE
HOST_FRAME_SIZE isn't used any more.  It has been replaced with MAX_REG_NR.

Signed-off-by: Jeff Dike <jdike@linux.intel.com>
Cc: Paolo 'Blaisorblade' Giarrusso <blaisorblade@yahoo.it>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2007-05-07 12:13:02 -07:00
Jeff Dike
d973a77bdb uml: irq locking commentary
Locking commentary.

Signed-off-by: Jeff Dike <jdike@linux.intel.com>
Cc: Paolo 'Blaisorblade' Giarrusso <blaisorblade@yahoo.it>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2007-05-07 12:13:02 -07:00