also drop the NORETRY we can probably nearly always satisfy order 1 allocs now,
and again the vmalloc path is there.
Signed-off-by: Dave Airlie <airlied@redhat.com>
percpu.h is included by sched.h and module.h and thus ends up being
included when building most .c files. percpu.h includes slab.h which
in turn includes gfp.h making everything defined by the two files
universally available and complicating inclusion dependencies.
percpu.h -> slab.h dependency is about to be removed. Prepare for
this change by updating users of gfp and slab facilities include those
headers directly instead of assuming availability. As this conversion
needs to touch large number of source files, the following script is
used as the basis of conversion.
http://userweb.kernel.org/~tj/misc/slabh-sweep.py
The script does the followings.
* Scan files for gfp and slab usages and update includes such that
only the necessary includes are there. ie. if only gfp is used,
gfp.h, if slab is used, slab.h.
* When the script inserts a new include, it looks at the include
blocks and try to put the new include such that its order conforms
to its surrounding. It's put in the include block which contains
core kernel includes, in the same order that the rest are ordered -
alphabetical, Christmas tree, rev-Xmas-tree or at the end if there
doesn't seem to be any matching order.
* If the script can't find a place to put a new include (mostly
because the file doesn't have fitting include block), it prints out
an error message indicating which .h file needs to be added to the
file.
The conversion was done in the following steps.
1. The initial automatic conversion of all .c files updated slightly
over 4000 files, deleting around 700 includes and adding ~480 gfp.h
and ~3000 slab.h inclusions. The script emitted errors for ~400
files.
2. Each error was manually checked. Some didn't need the inclusion,
some needed manual addition while adding it to implementation .h or
embedding .c file was more appropriate for others. This step added
inclusions to around 150 files.
3. The script was run again and the output was compared to the edits
from #2 to make sure no file was left behind.
4. Several build tests were done and a couple of problems were fixed.
e.g. lib/decompress_*.c used malloc/free() wrappers around slab
APIs requiring slab.h to be added manually.
5. The script was run on all .h files but without automatically
editing them as sprinkling gfp.h and slab.h inclusions around .h
files could easily lead to inclusion dependency hell. Most gfp.h
inclusion directives were ignored as stuff from gfp.h was usually
wildly available and often used in preprocessor macros. Each
slab.h inclusion directive was examined and added manually as
necessary.
6. percpu.h was updated not to include slab.h.
7. Build test were done on the following configurations and failures
were fixed. CONFIG_GCOV_KERNEL was turned off for all tests (as my
distributed build env didn't work with gcov compiles) and a few
more options had to be turned off depending on archs to make things
build (like ipr on powerpc/64 which failed due to missing writeq).
* x86 and x86_64 UP and SMP allmodconfig and a custom test config.
* powerpc and powerpc64 SMP allmodconfig
* sparc and sparc64 SMP allmodconfig
* ia64 SMP allmodconfig
* s390 SMP allmodconfig
* alpha SMP allmodconfig
* um on x86_64 SMP allmodconfig
8. percpu.h modifications were reverted so that it could be applied as
a separate patch and serve as bisection point.
Given the fact that I had only a couple of failures from tests on step
6, I'm fairly confident about the coverage of this conversion patch.
If there is a breakage, it's likely to be something in one of the arch
headers which should be easily discoverable easily on most builds of
the specific arch.
Signed-off-by: Tejun Heo <tj@kernel.org>
Guess-its-ok-by: Christoph Lameter <cl@linux-foundation.org>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: Lee Schermerhorn <Lee.Schermerhorn@hp.com>
New driver hooks for support graphics memory dma remapping
are introduced in this patch. It makes generic code can
tell if current device needs dma remapping, then call driver
provided interfaces for mapping and unmapping. Change has
also been made to handle scratch_page in remapping case.
Signed-off-by: Zhenyu Wang <zhenyu.z.wang@intel.com>
Signed-off-by: David Woodhouse <David.Woodhouse@intel.com>
In commit 07613ba2 ("agp: switch AGP to use page array instead of
unsigned long array") we switched the mask_memory() method to take a
'struct page *' instead of an address. This is painful, because in some
cases it has to be an IOMMU-mapped virtual bus address (in fact,
shouldn't it _always_ be a dma_addr_t returned from pci_map_xxx(), and
we just happen to get lucky most of the time?)
Signed-off-by: David Woodhouse <David.Woodhouse@intel.com>
This switches AGP to use an array of pages for tracking the
pages allocated to the GART. This should enable GEM on PAE to work
a lot better as we can pass highmem pages to the PAT code and it will
do the right thing with them.
Signed-off-by: Dave Airlie <airlied@redhat.com>
AGP pages might be mapped into userspace finally, so the pages should be
set to zero before userspace can use it. Otherwise there is potential
information leakage.
Signed-off-by: Shaohua Li <shaohua.li@intel.com>
Signed-off-by: Dave Airlie <airlied@redhat.com>
To reduce tlb/cache flush, makes agp memory allocation do one flush
after all pages in a region are changed to uc.
All agp drivers except agp-sgi uses agp_generic_alloc_page()
for .agp_alloc_page, so the patch should work for them. agp-sgi is only
for ia64, so not a problem too.
Signed-off-by: Shaohua Li <shaohua.li@intel.com>
Cc: airlied@linux.ie
Cc: Andrew Morton <akpm@linux-foundation.org>
Cc: Arjan van de Ven <arjan@infradead.org>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
On my Intel chipset (965GM), the GTT is entirely erased across
suspend/resume. This patch simply re-plays the current mapping at resume
time to restore the table.=20
I noticed this once I started relying on persistent GTT mappings across VT
switch in our GEM work -- the old X server and DRM code carefully unbind
all memory from the GTT on VT switch, but GEM does not bother.
I placed the list management and rewrite code in the generic layer on the
assumption that it will be needed on other hardware, but I did not add the
rewrite call to anything other than the Intel resume function.
Keep a list of current GATT mappings. At resume time, rewrite them into
the GATT. This is needed on Intel (at least) as the entire GATT is
cleared across suspend/resume.
[akpm@linux-foundation.org: coding-style fixes]
Signed-off-by: Keith Packard <keithp@keithp.com>
Cc: Dave Jones <davej@codemonkey.org.uk>
Cc: Andi Kleen <andi@firstfloor.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Dave Airlie <airlied@redhat.com>
Convert printks to use dev_printk().
Signed-off-by: Bjorn Helgaas <bjorn.helgaas@hp.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Dave Airlie <airlied@redhat.com>
It's not even passed on to smp_call_function() anymore, since that
was removed. So kill it.
Acked-by: Jeremy Fitzhardinge <jeremy.fitzhardinge@citrix.com>
Reviewed-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
Signed-off-by: Jens Axboe <jens.axboe@oracle.com>
besides it apparently being useful only in 2.6.24 (the changes in 2.6.25
really mean that it could be converted back to a single-stage mechanism),
I'm seeing an issue in Xen Dom0 kernels, which is caused by the calling
of gart_to_virt() in the second stage invocations of the destroy function.
I think that besides this being a real issue with Xen (where
unmap_page_from_agp() is not just a page table attribute change), this
also is invalid from a theoretical perspective: One should not assume that
gart_to_virt() is still valid after unmapping a page. So minimally (keeping
the 2-stage mechanism) a patch like the one below would be needed.
Jan
Signed-off-by: Dave Airlie <airlied@redhat.com>
Several AGP drivers right now use ioremap_nocache() on kernel ram in order
to turn a page of regular memory uncached.
There are two problems with this:
1) This is a total nightmare for the ioremap() implementation to keep
various mappings of the same page coherent.
2) It's a total nightmare for the AGP code since it adds a ton of
complexity in terms of keeping track of 2 different pointers to
the same thing, in terms of error handling etc etc.
This patch fixes this by making the AGP drivers use the new
set_memory_XX APIs instead.
Note: amd-k7-agp.c is built on Alpha too, and generic.c is built
on ia64 as well, which do not yet have the set_memory_*() APIs,
so for them some we have a few ugly #ifdefs - hopefully they'll
be fixed soon.
Signed-off-by: Arjan van de Ven <arjan@linux.intel.com>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
Signed-off-by: Dave Airlie <airlied@linux.ie>
This bumps the AGP interface to 0.103.
Certain Intel chipsets contains a global write buffer, and this can require
flushing from the drm or X.org to make sure all data has hit RAM before
initiating a GPU transfer, due to a lack of coherency with the integrated
graphics device and this buffer.
This just adds generic support to the AGP interfaces, a follow-on patch
will add support to the Intel driver to use this interface.
Signed-off-by: Dave Airlie <airlied@redhat.com>
With Andi's clflush fixup, we were getting hangs on server exit, flushing the
mappings after freeing each page helped.
This showed up a race condition where the pages after being freed could be
reused before the agp mappings had been flushed. Flushing after each single
page is a bad thing for future drm work, so make the page destroy a two pass
unmapping all the pages, flushing the mappings, and then destroying the pages.
Signed-off-by: Dave Airlie <airlied@linux.ie>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
AGP should not need to lock pages. They are not protecting any race
because there is no lock_page calls, only SetPageLocked.
This is causing hangs with d00806b183.
Signed-off-by: Nick Piggin <npiggin@suse.de>
Signed-off-by: Dave Airlie <airlied@redhat.com>
First thing mm.h does is including sched.h solely for can_do_mlock() inline
function which has "current" dereference inside. By dealing with can_do_mlock()
mm.h can be detached from sched.h which is good. See below, why.
This patch
a) removes unconditional inclusion of sched.h from mm.h
b) makes can_do_mlock() normal function in mm/mlock.c
c) exports can_do_mlock() to not break compilation
d) adds sched.h inclusions back to files that were getting it indirectly.
e) adds less bloated headers to some files (asm/signal.h, jiffies.h) that were
getting them indirectly
Net result is:
a) mm.h users would get less code to open, read, preprocess, parse, ... if
they don't need sched.h
b) sched.h stops being dependency for significant number of files:
on x86_64 allmodconfig touching sched.h results in recompile of 4083 files,
after patch it's only 3744 (-8.3%).
Cross-compile tested on
all arm defconfigs, all mips defconfigs, all powerpc defconfigs,
alpha alpha-up
arm
i386 i386-up i386-defconfig i386-allnoconfig
ia64 ia64-up
m68k
mips
parisc parisc-up
powerpc powerpc-up
s390 s390-up
sparc sparc-up
sparc64 sparc64-up
um-x86_64
x86_64 x86_64-up x86_64-defconfig x86_64-allnoconfig
as well as my two usual configs.
Signed-off-by: Alexey Dobriyan <adobriyan@gmail.com>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Remove an arch-dependent hunk in favor of #define-ing the respective bits in
asm-<arch>/agp.h (allowing easier overriding in para-virtualized environments).
Signed-off-by: Jan Beulich <jbeulich@novell.com>
Signed-off-by: Dave Jones <davej@redhat.com>
Fix whitespace, braces, use kzalloc().
Cc: Dave Airlie <airlied@linux.ie>
Cc: Thomas Hellstrom <thomas@tungstengraphics.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Dave Jones <davej@redhat.com>
This patch allows drm to populate an agpgart structure with pages of its own.
It's needed for the new drm memory manager which dynamically flips pages in and out of AGP.
The patch modifies the generic functions as well as the intel agp driver. The intel drm driver is
currently the only one supporting the new memory manager.
Other agp drivers may need some minor fixing up once they have a corresponding memory manager enabled drm driver.
AGP memory types >= AGP_USER_TYPES are not populated by the agpgart driver, but the drm is expected
to do that, as well as taking care of cache- and tlb flushing when needed.
It's not possible to request these types from user space using agpgart ioctls.
The Intel driver also gets a new memory type for pages that can be bound cached to the intel GTT.
Signed-off-by: Thomas Hellstrom <thomas@tungstengraphics.com>
Signed-off-by: Dave Jones <davej@redhat.com>
This patch is to speed up flipping of pages in and out of the AGP aperture as
needed by the new drm memory manager.
A number of global cache flushes are removed as well as some PCI posting flushes.
The following guidelines have been used:
1) Memory that is only mapped uncached and that has been subject to a global
cache flush after the mapping was changed to uncached does not need any more
cache flushes. Neither before binding to the aperture nor after unbinding.
2) Only do one PCI posting flush after a sequence of writes modifying page
entries in the GATT.
Signed-off-by: Thomas Hellstrom <thomas@tungstengraphics.com>
Signed-off-by: Dave Jones <davej@redhat.com>
Not all graphic page remappers support physical addresses over the 4GB
mark for remapping, so while some do (the AMD64 GART always did, and I
just fixed the i965 to do so properly), we're safest off just forcing
GFP_DMA32 allocations to make sure graphics pages get allocated in the
low 32-bit address space by default.
AGP sub-drivers that really care, and can do better, could just choose
to implement their own allocator (or we could add another "64-bit safe"
default allocator for their use), but quite frankly, you're not likely
to care in practice.
So for now, this trivial change means that we won't be allocating pages
that we can't map correctly by mistake on x86-64.
[ On traditional 32-bit x86, this could never happen, because GFP_KERNEL
would never allocate any highmem memory anyway ]
Acked-by: Andi Kleen <ak@suse.de>
Acked-by: Dave Jones <davej@redhat.com>
Cc: Eric Anholt <eric@anholt.net>
Cc: Keith Packard <keithp@keithp.com>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
Some dumb bridges are programmed to disobey the AGP2 spec.
This is likely a BIOS misprogramming rather than poweron default, or
it would be a lot more common.
AGPv2 spec 6.1.9 states:
"The RATE field indicates the data transfer rates supported by this
device. A.G.P. devices must report all that apply."
Fix them up as best we can.
This will prevent errors like..
agpgart: Found an AGP 3.5 compliant device at 0000:00:00.0.
agpgart: req mode 1f000201 bridge_agpstat 1f000a14 vga_agpstat 2f000217.
agpgart: Device is in legacy mode, falling back to 2.x
agpgart: Putting AGP V2 device at 0000:00:00.0 into 0x mode
agpgart: Putting AGP V2 device at 0000:01:00.0 into 0x mode
agpgart: Putting AGP V2 device at 0000:01:00.1 into 0x mode
https://bugs.freedesktop.org/show_bug.cgi?id=8816
Signed-off-by: Dave Jones <davej@redhat.com>
Sometimes the logic to handle AGPx8->AGPx4 fallback failed, as can
be seen in https://bugzilla.redhat.com/bugzilla/show_bug.cgi?id=197346
The failures occured if the bridge was in AGPx8 mode, but the
user hadn't specified a mode in their X config. We weren't
setting the mode to the highest mode capable by the video card+bridge
(as we do in the AGPv2 case), which was leading to all kinds of
mayhem including us believing that after falling back from AGPx8, that
we couldn't do x4 mode (which is disastrous in AGPv3, as those are
the only two modes possible).
Signed-off-by: Dave Jones <davej@redhat.com>
AGP shouldn't use "global_flush_tlb()" to flush the AGP mappings, that i
spurely an x86'ism. The proper AGP mapping flusher that should be used
is "flush_agp_mappings()", which on x86 obviously happens to do a global
TLB flush.
This makes AGP (or at least the config _I_ happen to use) compile again
on ppc64.
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
AGP allocation/deallocation is suffering major performance issues due to
the nature of global_flush_tlb() being called on every change_page_attr()
call.
For small allocations this isn't really seen, but when you start allocating
50000 pages of AGP space, for say, texture memory, then things can take
seconds to complete.
In some cases the situation is doubled or even quadrupled in the time due
to SMP, or a deallocation, then a new reallocation. I've had a case of
upto 20 seconds wait time to deallocate and reallocate AGP space.
This patch fixes the problem by making it the caller's responsibility to
call global_flush_tlb(), and so removes it from every instance of mapping a
page into AGP space until the time that all change_page_attr() changes are
done.
Signed-off-by: Dave Jones <davej@redhat.com>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Spotted by Jeremy Fitzhardinge, this change crept in with the multiple
backend support. It's clearly incorrect to overwrite info->mode after
we just went to lengths to determine which bits to mask out.
Signed-off-by: Dave Jones <davej@redhat.com>
[AGPGART] Replace check_bridge_mode() with (bridge->mode & AGSTAT_MODE_3_0).
As mentioned earlier, the current check_bridge_mode() code assumes
that AGP bridges are PCI devices. This isn't always true. Definitely
not for HP zx1 chipset and the same seems to be the case for SGI's AGP
bridge.
The patch below fixes the problem by picking up the AGP_MODE_3_0 bit
from bridge->mode. I feel like I may be missing something, since I
can't see any reason why check_bridge_mode() wasn't doing that in the
first place. According to the AGP 3.0 specs, the AGP_MODE_3_0 bit is
determined during the hardware reset and cannot be changed, so it
seems to me it should be safe to pick it up from bridge->mode.
With the patch applied, I can definitely use AGP acceleration both
with AGP 2.0 and AGP 3.0 (one with an Nvidia card, the other with an
ATI FireGL card).
Unless someone spots a problem, please apply this patch so 3d
acceleration can work on zx1 boxes again.
This makes AGP work again on machines with an AGP bridge that isn't a
PCI device.
Signed-off-by: David Mosberger-Tang <davidm@hpl.hp.com>
Signed-off-by: Dave Jones <davej@redhat.com>
When Linux is running on the Xen virtual machine monitor, physical
addresses are virtualised and cannot be directly referenced by the AGP
GART. This patch fixes the GART driver for Xen by adding a layer of
abstraction between physical addresses and 'GART addresses'.
Architecture-specific functions are also defined for allocating and freeing
the GATT. Xen requires this to ensure that table really is contiguous from
the point of view of the GART.
These extra interface functions are defined as 'no-ops' for all existing
architectures that use the GART driver.
Signed-off-by: Keir Fraser <keir@xensource.com>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Dave Jones <davej@redhat.com>
Initial git repository build. I'm not bothering with the full history,
even though we have it. We can create a separate "historical" git
archive of that later if we want to, and in the meantime it's about
3.2GB when imported into git - space that would just make the early
git days unnecessarily complicated, when we don't have a lot of good
infrastructure for it.
Let it rip!