Commit Graph

10236 Commits

Author SHA1 Message Date
FUJITA Tomonori
c5dec1c303 block: convert bio_copy_user to bio_copy_user_iov
This patch enables bio_copy_user to take struct sg_iovec (renamed
bio_copy_user_iov). bio_copy_user uses bio_copy_user_iov internally as
bio_map_user uses bio_map_user_iov.

The major changes are:

- adds sg_iovec array to struct bio_map_data

- adds __bio_copy_iov that copy data between bio and
sg_iovec. bio_copy_user_iov and bio_uncopy_user use it.

Signed-off-by: FUJITA Tomonori <fujita.tomonori@lab.ntt.co.jp>
Cc: Tejun Heo <htejun@gmail.com>
Cc: Mike Christie <michaelc@cs.wisc.edu>
Cc: James Bottomley <James.Bottomley@HansenPartnership.com>
Signed-off-by: Jens Axboe <jens.axboe@oracle.com>
2008-04-21 09:50:08 +02:00
Akinobu Mita
0a0c4114df cdrom: make unregister_cdrom() return void
Now unregister_cdrom() always returns 0.
Make it return void and update all callers that check the return value.

Signed-off-by: Akinobu Mita <akinobu.mita@gmail.com>
Cc: Adrian McMenamin <adrian@mcmen.demon.co.uk>
Cc: Borislav Petkov <petkovbb@gmail.com>
Signed-off-by: Jens Axboe <jens.axboe@oracle.com>
2008-04-21 09:50:08 +02:00
Akinobu Mita
7fd097d42b cdrom: use list_head for cdrom_device_info list
Use list_head for cdrom_device_info list instead of opencoded
singly list handling.

Signed-off-by: Akinobu Mita <akinobu.mita@gmail.com>
Signed-off-by: Jens Axboe <jens.axboe@oracle.com>
2008-04-21 09:50:08 +02:00
Dave Young
3f34d024c1 jiffies: add time_is_after_jiffies and others which compare with jiffies
Most of time_after like macros usages just compare jiffies and another number,
so here add some time_is_* macros for convenience.

Signed-off-by: Dave Young <hidave.darkstar@gmail.com>
Cc: Alan Cox <alan@lxorguk.ukuu.org.uk>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
2008-04-21 07:59:51 +02:00
Ivan Kokshaysky
884525655d PCI: clean up resource alignment management
Done per Linus' request and suggestions. Linus has explained that
better than I'll be able to explain:

On Thu, Mar 27, 2008 at 10:12:10AM -0700, Linus Torvalds wrote:
> Actually, before we go any further, there might be a less intrusive
> alternative: add just a couple of flags to the resource flags field (we
> still have something like 8 unused bits on 32-bit), and use those to
> implement a generic "resource_alignment()" routine.
>
> Two flags would do it:
>
>  - IORESOURCE_SIZEALIGN: size indicates alignment (regular PCI device
>    resources)
>
>  - IORESOURCE_STARTALIGN: start field is alignment (PCI bus resources
>    during probing)
>
> and then the case of both flags zero (or both bits set) would actually be
> "invalid", and we would also clear the IORESOURCE_STARTALIGN flag when we
> actually allocate the resource (so that we don't use the "start" field as
> alignment incorrectly when it no longer indicates alignment).
>
> That wouldn't be totally generic, but it would have the nice property of
> automatically at least add sanity checking for that whole "res->start has
> the odd meaning of 'alignment' during probing" and remove the need for a
> new field, and it would allow us to have a generic "resource_alignment()"
> routine that just gets a resource pointer.

Besides, I removed IORESOURCE_BUS_HAS_VGA flag which was unused for ages.

Signed-off-by: Ivan Kokshaysky <ink@jurassic.park.msu.ru>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Gary Hade <garyhade@us.ibm.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
2008-04-20 21:47:08 -07:00
Ben Hutchings
94e6108803 PCI: Expose PCI VPD through sysfs
Vital Product Data (VPD) may be exposed by PCI devices in several
ways.  It is generally unsafe to read this information through the
existing interfaces to user-land because of stateful interfaces.

This adds:
- abstract operations for VPD access (struct pci_vpd_ops)
- VPD state information in struct pci_dev (struct pci_vpd)
- an implementation of the VPD access method specified in PCI 2.2
  (in access.c)
- a 'vpd' binary file in sysfs directories for PCI devices with VPD
  operations defined

It adds a probe for PCI 2.2 VPD in pci_scan_device() and release of
VPD state in pci_release_dev().

Signed-off-by: Ben Hutchings <bhutchings@solarflare.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
2008-04-20 21:47:07 -07:00
Bjorn Helgaas
842de40d93 PCI: add generic pci_enable_resources()
Each architecture has its own pcibios_enable_resources() implementation.
These differ in many minor ways that have nothing to do with actual
architectural differences.  Follow-on patches will make most arches
use this generic version instead.

This version is based on powerpc, which seemed most up-to-date.  The only
functional difference from the x86 version is that this uses "!r->parent"
to check for resource collisions instead of "!r->start && r->end".

Signed-off-by: Bjorn Helgaas <bjorn.helgaas@hp.com>
Acked-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
Acked-by: David Howells <dhowells@redhat.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
2008-04-20 21:47:04 -07:00
Shaohua Li
7d715a6c1a PCI: add PCI Express ASPM support
PCI Express ASPM defines a protocol for PCI Express components in the D0
state to reduce Link power by placing their Links into a low power state
and instructing the other end of the Link to do likewise. This
capability allows hardware-autonomous, dynamic Link power reduction
beyond what is achievable by software-only controlled power management.
However, The device should be configured by software appropriately.
Enabling ASPM will save power, but will introduce device latency.

This patch adds ASPM support in Linux. It introduces a global policy for
ASPM, a sysfs file /sys/module/pcie_aspm/parameters/policy can control
it. The interface can be used as a boot option too. Currently we have
below setting:
        -default, BIOS default setting
        -powersave, highest power saving mode, enable all available ASPM
state and clock power management
        -performance, highest performance, disable ASPM and clock power
management
By default, the 'default' policy is used currently.

In my test, power difference between powersave mode and performance mode
is about 1.3w in a system with 3 PCIE links.

Note: some devices might not work well with aspm, either because chipset
issue or device issue. The patch provide API (pci_disable_link_state),
driver can disable ASPM for specific device.

Signed-off-by: Shaohua Li <shaohua.li@intel.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
2008-04-20 21:47:03 -07:00
Adrian Bunk
21c6847406 PCI: #if 0 pci_cleanup_aer_correct_error_status()
#if 0 the no longer used pci_cleanup_aer_correct_error_status().

Signed-off-by: Adrian Bunk <bunk@kernel.org>
Cc: Stephen Hemminger <shemminger@linux-foundation.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
2008-04-20 21:47:02 -07:00
Greg Kroah-Hartman
5ff580c10e PCI: remove global list of PCI devices
This patch finally removes the global list of PCI devices.  We are
relying entirely on the list held in the driver core now, and do not
need a separate "shadow" list as no one uses it.

Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
2008-04-20 21:47:02 -07:00
Greg Kroah-Hartman
8a1bc9013a PCI: add is_added flag to struct pci_dev
This lets us check if the device is really added to the driver core or
not, which is what we need when walking some of the bus lists.  The flag
is there in anticipation of getting rid of the other PCI device list,
which is what we used to check in this situation.

Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
2008-04-20 21:47:00 -07:00
Greg Kroah-Hartman
95247b57ed PCI: clean up search.c a lot
This cleans up the search.c file, now using the pci list of devices that
are created for the driver core, instead of relying on our separate list
of devices.  It's better to use the functions already created for this
kind of thing, instead of rolling our own all the time.

This work is done in anticipation of getting rid of that second list of
pci devices all together.

And it ends up saving code, always a nice benefit.

This also removes one compiler warning for when CONFIG_PCI_LEGACY is
enabled as we no longer internally use the deprecated functions anymore.

Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
2008-04-20 21:46:54 -07:00
Greg Kroah-Hartman
34220909a2 PCI: remove pci_get_device_reverse
This removes the pci_get_device_reverse function as there should not be
any need to walk pci devices backwards anymore.  All users of this call
are now gone from the tree, so it is safe to remove it.

Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
2008-04-20 21:46:53 -07:00
Greg Kroah-Hartman
448432c4b8 PCI: remove pci_find_present
No one is using this function anymore for quite some time, so remove it.
Everyone calls pci_dev_present() instead anyway...

Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
2008-04-20 21:46:52 -07:00
Adrian Bunk
2baad5f96b PCI: #if 0 pci_assign_resource_fixed()
An unused function that bloated the kernel only when CONFIG_EMBEDDED was
enabled...

Signed-off-by: Adrian Bunk <bunk@kernel.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
2008-04-20 21:46:52 -07:00
Sebastian Siewior
c3715cb90f [CRYPTO] api: Make the crypto subsystem fully modular
Signed-off-by: Sebastian Siewior <sebastian@breakpoint.cc>
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
2008-04-21 10:19:23 +08:00
Randy Dunlap
f7d0e5a506 skbuff: fix missing kernel-doc notation
Add kernel-doc notation for ndisc_nodetype:

Warning(linux-2.6.25-git2//include/linux/skbuff.h:340): No description found for parameter 'ndisc_nodetype'

Signed-off-by: Randy Dunlap <randy.dunlap@oracle.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2008-04-20 16:06:22 -07:00
Tony Jones
ee959b00c3 SCSI: convert struct class_device to struct device
It's big, but there doesn't seem to be a way to split it up smaller...

Signed-off-by: Tony Jones <tonyj@suse.de>
Signed-off-by: Kay Sievers <kay.sievers@vrfy.org>
Cc: Roland Dreier <rolandd@cisco.com>
Cc: Sean Hefty <sean.hefty@intel.com>
Cc: Hal Rosenstock <hal.rosenstock@gmail.com>
Cc: James Bottomley <James.Bottomley@HansenPartnership.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
2008-04-19 19:10:33 -07:00
Greg Kroah-Hartman
c4c66cf178 memstick: convert struct class_device to struct device
struct class_device is going away, struct device should be used instead.

Signed-off-by: Tony Jones <tonyj@suse.de>
Signed-off-by: Kay Sievers <kay.sievers@vrfy.org>
Cc: Alex Dubov <oakad@yahoo.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
2008-04-19 19:10:29 -07:00
Rafael J. Wysocki
b844eba292 PM: Remove destroy_suspended_device()
After 2.6.24 there was a plan to make the PM core acquire all device
semaphores during a suspend/hibernation to protect itself from
concurrent operations involving device objects.  That proved to be
too heavy-handed and we found a better way to achieve the goal, but
before it happened, we had introduced the functions
device_pm_schedule_removal() and destroy_suspended_device() to allow
drivers to "safely" destroy a suspended device and we had adapted some
drivers to use them.  Now that these functions are no longer necessary,
it seems reasonable to remove them and modify their users to use the
normal device unregistration instead.

Signed-off-by: Rafael J. Wysocki <rjw@sisk.pl>
Acked-by: Pavel Machek <pavel@ucw.cz>
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
2008-04-19 19:10:28 -07:00
Konrad Rzeszutek
138fe4e069 Firmware: add iSCSI iBFT Support
Add /sysfs/firmware/ibft/[initiator|targetX|ethernetX] directories along with
text properties which export the the iSCSI Boot Firmware Table (iBFT)
structure.

What is iSCSI Boot Firmware Table?  It is a mechanism for the iSCSI tools to
extract from the machine NICs the iSCSI connection information so that they
can automagically mount the iSCSI share/target.  Currently the iSCSI
information is hard-coded in the initrd.  The /sysfs entries are read-only
one-name-and-value fields.

The usual set of data exposed is:

# for a in `find /sys/firmware/ibft/ -type f -print`; do  echo -n "$a: ";  cat $a; done
/sys/firmware/ibft/target0/target-name: iqn.2007.com.intel-sbx44:storage-10gb
/sys/firmware/ibft/target0/nic-assoc: 0
/sys/firmware/ibft/target0/chap-type: 0
/sys/firmware/ibft/target0/lun: 00000000
/sys/firmware/ibft/target0/port: 3260
/sys/firmware/ibft/target0/ip-addr: 192.168.79.116
/sys/firmware/ibft/target0/flags: 3
/sys/firmware/ibft/target0/index: 0
/sys/firmware/ibft/ethernet0/mac: 00:11:25:9d:8b:01
/sys/firmware/ibft/ethernet0/vlan: 0
/sys/firmware/ibft/ethernet0/gateway: 192.168.79.254
/sys/firmware/ibft/ethernet0/origin: 0
/sys/firmware/ibft/ethernet0/subnet-mask: 255.255.252.0
/sys/firmware/ibft/ethernet0/ip-addr: 192.168.77.41
/sys/firmware/ibft/ethernet0/flags: 7
/sys/firmware/ibft/ethernet0/index: 0
/sys/firmware/ibft/initiator/initiator-name: iqn.2007-07.com:konrad.initiator
/sys/firmware/ibft/initiator/flags: 3
/sys/firmware/ibft/initiator/index: 0

For full details of the IBFT structure please take a look at:
ftp://ftp.software.ibm.com/systems/support/system_x_pdf/ibm_iscsi_boot_firmware_table_v1.02.pdf

[akpm@linux-foundation.org: fix build]
Signed-off-by: Konrad Rzeszutek <konradr@linux.vnet.ibm.com>
Cc: Mike Christie <michaelc@cs.wisc.edu>
Cc: Peter Jones <pjones@redhat.com>
Cc: James Bottomley <James.Bottomley@HansenPartnership.com>
Cc: Ingo Molnar <mingo@elte.hu>
Cc: Thomas Gleixner <tglx@linutronix.de>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
2008-04-19 19:10:28 -07:00
Greg Kroah-Hartman
3f62e5700b Driver core: make device_is_registered() work for class devices
device_is_registered() can use the kobject value for this, so it will
now work with devices that are associated with only a class, not a bus
and a driver.

Cc: Kay Sievers <kay.sievers@vrfy.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
2008-04-19 19:10:26 -07:00
Alan Stern
9a3df1f7de PM: Convert wakeup flag accessors to inline functions
This patch (as1058) improves the wakeup macros in include/linux/pm.h.
All but the trivial ones are converted to inline routines, which
requires moving them to a separate header file since they depend on
the definition of struct device.

Signed-off-by: Alan Stern <stern@rowland.harvard.edu>
Signed-off-by: Rafael J. Wysocki <rjw@sisk.pl>
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
2008-04-19 19:10:26 -07:00
Alan Stern
d288e47c47 PM: Make wakeup flags available whenever CONFIG_PM is set
The various wakeup flags and their accessor macros in struct
dev_pm_info should be available whenever CONFIG_PM is enabled, not
just when CONFIG_PM_SLEEP is on.  Otherwise remote wakeup won't always
be configurable for runtime power management.  This patch (as1056b)
fixes the oversight.

David Brownell adds:
	More accurately, fixes the "regression" ... as noted sometime
	last summer, after 296699de6b
	introduced CONFIG_SUSPEND.  But that didn't make the regression
	list for that kernel, ergo the delay in fixing it.

[rjw: rebased]

Signed-off-by: Alan Stern <stern@rowland.harvard.edu>
Signed-off-by: Rafael J. Wysocki <rjw@sisk.pl>
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
2008-04-19 19:10:25 -07:00
Rafael J. Wysocki
58aca23226 PM: Handle device registrations during suspend/resume
Modify the PM core to protect its data structures, specifically the
dpm_active list, from being corrupted if a child of the currently
suspending device is registered concurrently with its ->suspend()
callback.  In that case, since the new device (the child) is added
to dpm_active after its parent, the PM core will attempt to
suspend it after the parent, which is wrong.

Introduce a new member of struct dev_pm_info, called 'sleeping',
and use it to check if the parent of the device being added to
dpm_active has been suspended, in which case the device registration
fails.  Also, use 'sleeping' for checking if the ordering of devices
on dpm_active is correct.

Introduce variable 'all_sleeping' that will be set to 'true' once all
devices have been suspended and make new device registrations fail
until 'all_sleeping' is reset to 'false', in order to avoid having
unsuspended devices around while the system is going into a sleep state.

Remove pm_sleep_rwsem which is not necessary any more.

Special thanks to Alan Stern for discussions and suggestions that
lead to the creation of this patch.

Signed-off-by: Rafael J. Wysocki <rjw@sisk.pl>
Acked-by: Pavel Machek <pavel@ucw.cz>
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
2008-04-19 19:10:24 -07:00
David Rientjes
3612e06b2c sysfs: small header file cleanup for SYSFS=n
Convert sysfs_remove_bin_file() to have a return type of 'void' for
!CONFIG_SYSFS configurations.  Also removes unnecessary colons from empty
void functions.

Signed-off-by: David Rientjes <rientjes@google.com>
Reviewed-by: Randy Dunlap <randy.dunlap@oracle.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
2008-04-19 19:10:20 -07:00
Joe Perches
1429db83e2 driver core: Convert debug functions declared inline __attribute__((format (printf,x,y) to statement expression macros
When DEBUG is not defined, pr_debug and dev_dbg and some
other local debugging functions are specified as:

"inline __attribute__((format (printf, x, y)))"

This is done to validate printk arguments when not debugging.

Converting these functions to macros or statement expressions
"do { if (0) printk(fmt, ##arg); } while (0)"
or
"({ if (0) printk(fmt, ##arg); 0; })
makes at least gcc 4.2.2 produce smaller objects.

This has the additional benefit of allowing the optimizer to
avoid calling functions like print_mac that might have been
arguments to the printk.

defconfig x86 current:

$ size vmlinux
   text    data     bss     dec     hex filename
4716770  474560  618496 5809826  58a6a2 vmlinux

all converted: (More patches follow)

$ size vmlinux
   text    data     bss     dec     hex filename
4716642  474560  618496 5809698  58a622 vmlinux

Even kernel/sched.o, which doesn't even use these
functions, becomes smaller.

It appears that merely having an indirect include
of <linux/device.h> can cause bigger objects.

$ size sched.inline.o sched.if0.o
   text    data     bss     dec     hex filename
  31385    2854     328   34567    8707 sched.inline.o
  31366    2854     328   34548    86f4 sched.if0.o

The current preprocessed only kernel/sched.i file contains:

# 612 "include/linux/device.h"
static inline __attribute__((always_inline)) int __attribute__ ((format (printf, 2, 3)))
dev_dbg(struct device *dev, const char *fmt, ...)
{
 return 0;
}
# 628 "include/linux/device.h"
static inline __attribute__((always_inline)) int __attribute__ ((format (printf, 2, 3)))
dev_vdbg(struct device *dev, const char *fmt, ...)
{
 return 0;
}

Removing these unused inlines from sched.i shrinks sched.o

Signed-off-by: Joe Perches <joe@perches.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
2008-04-19 19:10:19 -07:00
Daniel Walker
da19cbcf71 driver core: memory: semaphore to mutex
Signed-off-by: Daniel Walker <dwalker@mvista.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
2008-04-19 19:10:19 -07:00
Haavard Skinnemoen
e1c25dc638 Merge branch 'master' of git://git.kernel.org/pub/scm/linux/kernel/git/hskinnemoen/usba-2.6.26 into base 2008-04-19 20:38:41 -04:00
Haavard Skinnemoen
03414e57ad Merge branch 'master' of git://git.kernel.org/pub/scm/linux/kernel/git/hskinnemoen/tclib into base 2008-04-19 20:38:13 -04:00
Adrian Bunk
a3dab29353 make nfs_automount_list static
nfs_automount_list can now become static.

Signed-off-by: Adrian Bunk <bunk@kernel.org>
Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
2008-04-19 16:55:29 -04:00
Trond Myklebust
7c1d71cf56 SUNRPC: Don't disconnect more than once if retransmitting NFSv4 requests
NFSv4 requires us to ensure that we break the TCP connection before we're
allowed to retransmit a request. However in the case where we're
retransmitting several requests that have been sent on the same
connection, we need to ensure that we don't interfere with the attempt to
reconnect and/or break the connection again once it has been established.

We therefore introduce a 'connection' cookie that is bumped every time a
connection is broken. This allows requests to track if they need to force a
disconnection.

Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
2008-04-19 16:55:12 -04:00
Trond Myklebust
7c67db3a8a NFSv4: Reintroduce machine creds
We need to try to ensure that we always use the same credentials whenever
we re-establish the clientid on the server. If not, the server won't
recognise that we're the same client, and so may not allow us to recover
state.

Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
2008-04-19 16:54:56 -04:00
Trond Myklebust
78ea323be6 NFSv4: Don't use cred->cr_ops->cr_name in nfs4_proc_setclientid()
With the recent change to generic creds, we can no longer use
cred->cr_ops->cr_name to distinguish between RPCSEC_GSS principals and
AUTH_SYS/AUTH_NULL identities. Replace it with the rpc_authops->au_name
instead...

Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
2008-04-19 16:54:53 -04:00
Trond Myklebust
5e7f37a76f NLM/lockd: Add a reference counter to struct nlm_rqst
When we replace the existing synchronous RPC calls with asynchronous calls,
the reference count will be needed in order to allow us to examine the
result of the RPC call.

Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
2008-04-19 16:53:36 -04:00
Trond Myklebust
c1d519312d NFSv4: Only increment the sequence id if the server saw it
It is quite possible that the OPEN, CLOSE, LOCK, LOCKU,... compounds fail
before the actual stateful operation has been executed (for instance in the
PUTFH call). There is no way to tell from the overall status result which
operations were executed from the COMPOUND.

The fix is to move incrementing of the sequence id into the XDR layer,
so that we do it as we process the results from the stateful operation.

Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
2008-04-19 16:53:15 -04:00
Trond Myklebust
c9d8f89d98 NFS: Ensure that the write code cleans up properly when rpc_run_task() fails
Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
2008-04-19 16:53:05 -04:00
Trond Myklebust
b6ddf64ffe SUNRPC: Fix up xprt_write_space()
The rest of the networking layer uses SOCK_ASYNC_NOSPACE to signal whether
or not we have someone waiting for buffer memory. Convert the SUNRPC layer
to use the same idiom.
Remove the unlikely()s in xs_udp_write_space and xs_tcp_write_space. In
fact, the most common case will be that there is nobody waiting for buffer
space.

SOCK_NOSPACE is there to tell the TCP layer whether or not the cwnd was
limited by the application window. Ensure that we follow the same idiom as
the rest of the networking layer here too.

Finally, ensure that we clear SOCK_ASYNC_NOSPACE once we wake up, so that
write_space() doesn't keep waking things up on xprt->pending.

Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
2008-04-19 16:52:44 -04:00
Peter Zijlstra
4a55bd5e97 sched: fair-group: de-couple load-balancing from the rb-trees
De-couple load-balancing from the rb-trees, so that I can change their
organization.

Signed-off-by: Peter Zijlstra <a.p.zijlstra@chello.nl>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
2008-04-19 19:45:00 +02:00
Peter Zijlstra
58d6c2d72f sched: rt-group: optimize dequeue_rt_stack
Now that the group hierarchy can have an arbitrary depth the O(n^2) nature
of RT task dequeues will really hurt. Optimize this by providing space to
store the tree path, so we can walk it the other way.

Signed-off-by: Peter Zijlstra <a.p.zijlstra@chello.nl>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
2008-04-19 19:45:00 +02:00
Peter Zijlstra
18d95a2832 sched: fair-group: SMP-nice for group scheduling
Implement SMP nice support for the full group hierarchy.

On each load-balance action, compile a sched_domain wide view of the full
task_group tree. We compute the domain wide view when walking down the
hierarchy, and readjust the weights when walking back up.

After collecting and readjusting the domain wide view, we try to balance the
tasks within the task_groups. The current approach is a naively balance each
task group until we've moved the targeted amount of load.

Inspired by Srivatsa Vaddsgiri's previous code and Abhishek Chandra's H-SMP
paper.

XXX: there will be some numerical issues due to the limited nature of
     SCHED_LOAD_SCALE wrt to representing a task_groups influence on the
     total weight. When the tree is deep enough, or the task weight small
     enough, we'll run out of bits.

Signed-off-by: Peter Zijlstra <a.p.zijlstra@chello.nl>
CC: Abhishek Chandra <chandra@cs.umn.edu>
CC: Srivatsa Vaddagiri <vatsa@linux.vnet.ibm.com>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
2008-04-19 19:45:00 +02:00
Hidetoshi Seto
1d3504fcf5 sched, cpuset: customize sched domains, core
[rebased for sched-devel/latest]

 - Add a new cpuset file, having levels:
     sched_relax_domain_level

 - Modify partition_sched_domains() and build_sched_domains()
   to take attributes parameter passed from cpuset.

 - Fill newidle_idx for node domains which currently unused but
   might be required if sched_relax_domain_level become higher.

 - We can change the default level by boot option 'relax_domain_level='.

Signed-off-by: Hidetoshi Seto <seto.hidetoshi@jp.fujitsu.com>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
2008-04-19 19:45:00 +02:00
Peter Zijlstra
eff766a65c sched: fix the task_group hierarchy for UID grouping
UID grouping doesn't actually have a task_group representing the root of
the task_group tree. Add one.

Signed-off-by: Peter Zijlstra <a.p.zijlstra@chello.nl>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
2008-04-19 19:45:00 +02:00
Dhaval Giani
ec7dc8ac73 sched: allow the group scheduler to have multiple levels
This patch makes the group scheduler multi hierarchy aware.

[a.p.zijlstra@chello.nl: rt-parts and assorted fixes]
Signed-off-by: Dhaval Giani <dhaval@linux.vnet.ibm.com>
Signed-off-by: Peter Zijlstra <a.p.zijlstra@chello.nl>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
2008-04-19 19:44:59 +02:00
Mike Travis
cd8ba7cd9b sched: add new set_cpus_allowed_ptr function
Add a new function that accepts a pointer to the "newly allowed cpus"
cpumask argument.

int set_cpus_allowed_ptr(struct task_struct *p, const cpumask_t *new_mask)

The current set_cpus_allowed() function is modified to use the above
but this does not result in an ABI change.  And with some compiler
optimization help, it may not introduce any additional overhead.

Additionally, to enforce the read only nature of the new_mask arg, the
"const" property is migrated to sub-functions called by set_cpus_allowed.
This silences compiler warnings.

Signed-off-by: Mike Travis <travis@sgi.com>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
2008-04-19 19:44:59 +02:00
Mike Travis
9d1fe3236a cpumask: add show cpu map functions
* Add cpu_sysdev_class functions to display the following maps
    with cpulist_scnprintf().

	cpu_online_map
	cpu_present_map
	cpu_possible_map

  * Small change to include/linux/sysdev.h to allow the attribute
    name and label to be different (to avoid collision with the
    "attr_online" entry for bringing cpus on- and off-line.)

Cc: H. Peter Anvin <hpa@zytor.com>
Signed-off-by: Mike Travis <travis@sgi.com>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
2008-04-19 19:44:59 +02:00
Mike Travis
9f0e8d0400 x86: convert cpumask_of_cpu macro to allocated array
* Here is a simple patch to use an allocated array of cpumasks to
    represent cpumask_of_cpu() instead of constructing one on the stack.
    It's based on the Kconfig option "HAVE_CPUMASK_OF_CPU_MAP" which is
    currently only set for x86_64 SMP.  Otherwise the the existing
    cpumask_of_cpu() is used but has been changed to produce an lvalue
    so a pointer to it can be used.

Cc: H. Peter Anvin <hpa@zytor.com>
Signed-off-by: Christoph Lameter <clameter@sgi.com>
Signed-off-by: Mike Travis <travis@sgi.com>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
2008-04-19 19:44:59 +02:00
Mike Travis
321a8e9dcb cpumask: add CPU_MASK_ALL_PTR macro
* Add a static cpumask_t variable "CPU_MASK_ALL_PTR" to use as
    a pointer reference to CPU_MASK_ALL.  This reduces where possible
    the instances where CPU_MASK_ALL allocates and fills a large
    array on the stack.  Used only if NR_CPUS > BITS_PER_LONG.

  * Change init/main.c to use new set_cpus_allowed_ptr().

Depends on:
	[sched-devel]: sched: add new set_cpus_allowed_ptr function

Cc: H. Peter Anvin <hpa@zytor.com>
Signed-off-by: Mike Travis <travis@sgi.com>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
2008-04-19 19:44:59 +02:00
Mike Travis
7c16ec585c cpumask: reduce stack usage in SD_x_INIT initializers
* Remove empty cpumask_t (and all non-zero/non-null) variables
    in SD_*_INIT macros.  Use memset(0) to clear.  Also, don't
    inline the initializer functions to save on stack space in
    build_sched_domains().

  * Merge change to include/linux/topology.h that uses the new
    node_to_cpumask_ptr function in the nr_cpus_node macro into
    this patch.

Depends on:
	[mm-patch]: asm-generic-add-node_to_cpumask_ptr-macro.patch
	[sched-devel]: sched: add new set_cpus_allowed_ptr function

Cc: H. Peter Anvin <hpa@zytor.com>
Signed-off-by: Mike Travis <travis@sgi.com>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
2008-04-19 19:44:59 +02:00
Mike Travis
b53e921ba1 generic: reduce stack pressure in sched_affinity
* Modify sched_affinity functions to pass cpumask_t variables by reference
    instead of by value.

  * Use new set_cpus_allowed_ptr function.

Depends on:
	[sched-devel]: sched: add new set_cpus_allowed_ptr function

Cc: Paul Jackson <pj@sgi.com>
Cc: Cliff Wickman <cpw@sgi.com>
Signed-off-by: Mike Travis <travis@sgi.com>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
2008-04-19 19:44:59 +02:00
Mike Travis
f9a86fcbbb cpuset: modify cpuset_set_cpus_allowed to use cpumask pointer
* Modify cpuset_cpus_allowed to return the currently allowed cpuset
    via a pointer argument instead of as the function return value.

  * Use new set_cpus_allowed_ptr function.

  * Cleanup CPU_MASK_ALL and NODE_MASK_ALL uses.

Depends on:
	[sched-devel]: sched: add new set_cpus_allowed_ptr function

Signed-off-by: Mike Travis <travis@sgi.com>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
2008-04-19 19:44:58 +02:00
Mike Travis
30ca60c15a cpumask: add cpumask_scnprintf_len function
Add a new function cpumask_scnprintf_len() to return the number of
characters needed to display "len" cpumask bits.  The current method
of allocating NR_CPUS bytes is incorrect as what's really needed is
9 characters per 32-bit word of cpumask bits (8 hex digits plus the
seperator [','] or the terminating NULL.)  This function provides the
caller the means to allocate the correct string length.

Cc: Paul Jackson <pj@sgi.com>
Signed-off-by: Mike Travis <travis@sgi.com>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
2008-04-19 19:44:58 +02:00
Peter Zijlstra
d0b27fa778 sched: rt-group: synchonised bandwidth period
Various SMP balancing algorithms require that the bandwidth period
run in sync.

Possible improvements are moving the rt_bandwidth thing into root_domain
and keeping a span per rt_bandwidth which marks throttled cpus.

Signed-off-by: Peter Zijlstra <a.p.zijlstra@chello.nl>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
2008-04-19 19:44:57 +02:00
Ingo Molnar
57d3da2911 time: add ns_to_ktime()
Signed-off-by: Ingo Molnar <mingo@elte.hu>
2008-04-19 19:44:57 +02:00
Ingo Molnar
50df5d6aea sched: remove sysctl_sched_batch_wakeup_granularity
it's unused.

Signed-off-by: Ingo Molnar <mingo@elte.hu>
2008-04-19 19:44:57 +02:00
Erik Bosman
8fb402bccf generic, x86: add prctl commands PR_GET_TSC and PR_SET_TSC
This patch adds prctl commands that make it possible
to deny the execution of timestamp counters in userspace.
If this is not implemented on a specific architecture,
prctl will return -EINVAL.

ned-off-by: Erik Bosman <ejbosman@cs.vu.nl>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
2008-04-19 19:19:55 +02:00
Harvey Harrison
a7d5ac87b2 x86: pageattr.c fix shadowed variable warning
irqs_disabled() uses flags internally, use _flags to avoid shadowing
code calling into this macro.

Introduced between 2.6.25-rc3 and -rc4

Fixes the sparse warning:
arch/x86/mm/pageattr.c:383:21: warning: symbol 'flags' shadows an earlier one
arch/x86/mm/pageattr.c:369:16: originally declared here

Signed-off-by: Harvey Harrison <harvey.harrison@gmail.com>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
2008-04-19 19:19:54 +02:00
Huang, Ying
4a3575fd43 x86: EFI_PAGE_SHIFT fix
Make x86 EFI code works when EFI_PAGE_SHIFT != PAGE_SHIFT. The
memrage_efi_to_native() provided in this patch can be used on other
EFI platform such as IA64 too.

This patch has been tested on Intel x86_64 platform with EFI 64/32
firmware.

Signed-off-by: Huang Ying <ying.huang@intel.com>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
2008-04-19 19:19:54 +02:00
Russell King
cf816ecb53 Merge branch 'merge-fixes' into devel 2008-04-19 17:17:34 +01:00
Russell King
adf6d34e46 Merge branch 'omap2-upstream' into devel 2008-04-19 17:17:29 +01:00
Russell King
d1964dab60 Merge branches 'arm', 'at91', 'ep93xx', 'iop', 'ks8695', 'misc', 'mxc', 'ns9x', 'orion', 'pxa', 'sa1100', 's3c' and 'sparsemem' into devel 2008-04-19 17:17:25 +01:00
Philipp Zabel
5dc3339aa5 [ARM] 4964/1: htc-pasic3: MFD driver for PASIC3 LED control + DS1WM chip
This driver will provide registers, clocks and GPIOs of
the HTC PASIC3 (AIC3) and PASIC2 (AIC2) chips to the
ds1wm and leds-pasic3 drivers.

Signed-off-by: Philipp Zabel <philipp.zabel@gmail.com>
Signed-off-by: Russell King <rmk+kernel@arm.linux.org.uk>
2008-04-19 11:29:08 +01:00
Philipp Zabel
a1635b8fe5 [ARM] 4947/1: htc-egpio, a driver for GPIO/IRQ expanders with fixed input/output pins
implemented in CPLD chips on several HTC devices.

The original driver was written by Kevin O'Connor, I have adapted it to
use gpiolib and made the bus/register widths configurable.

Signed-off-by: Philipp Zabel <philipp.zabel@gmail.com>
Signed-off-by: Russell King <rmk+kernel@arm.linux.org.uk>
2008-04-19 11:29:07 +01:00
Dave Hansen
ad775f5a8f [PATCH] r/o bind mounts: debugging for missed calls
There have been a few oopses caused by 'struct file's with NULL f_vfsmnts.
There was also a set of potentially missed mnt_want_write()s from
dentry_open() calls.

This patch provides a very simple debugging framework to catch these kinds of
bugs.  It will WARN_ON() them, but should stop us from having any oopses or
mnt_writer count imbalances.

I'm quite convinced that this is a good thing because it found bugs in the
stuff I was working on as soon as I wrote it.

[hch: made it conditional on a debug option.
      But it's still a little bit too ugly]

[hch: merged forced remount r/o fix from Dave and akpm's fix for the fix]

Signed-off-by: Dave Hansen <haveblue@us.ibm.com>
Acked-by: Al Viro <viro@ZenIV.linux.org.uk>
Signed-off-by: Christoph Hellwig <hch@lst.de>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
2008-04-19 00:29:28 -04:00
Dave Hansen
2e4b7fcd92 [PATCH] r/o bind mounts: honor mount writer counts at remount
Originally from: Herbert Poetzl <herbert@13thfloor.at>

This is the core of the read-only bind mount patch set.

Note that this does _not_ add a "ro" option directly to the bind mount
operation.  If you require such a mount, you must first do the bind, then
follow it up with a 'mount -o remount,ro' operation:

If you wish to have a r/o bind mount of /foo on bar:

	mount --bind /foo /bar
	mount -o remount,ro /bar

Acked-by: Al Viro <viro@ZenIV.linux.org.uk>
Signed-off-by: Christoph Hellwig <hch@infradead.org>
Signed-off-by: Dave Hansen <haveblue@us.ibm.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
2008-04-19 00:29:27 -04:00
Dave Hansen
3d733633a6 [PATCH] r/o bind mounts: track numbers of writers to mounts
This is the real meat of the entire series.  It actually
implements the tracking of the number of writers to a mount.
However, it causes scalability problems because there can be
hundreds of cpus doing open()/close() on files on the same mnt at
the same time.  Even an atomic_t in the mnt has massive scalaing
problems because the cacheline gets so terribly contended.

This uses a statically-allocated percpu variable.  All want/drop
operations are local to a cpu as long that cpu operates on the same
mount, and there are no writer count imbalances.  Writer count
imbalances happen when a write is taken on one cpu, and released
on another, like when an open/close pair is performed on two

Upon a remount,ro request, all of the data from the percpu
variables is collected (expensive, but very rare) and we determine
if there are any outstanding writers to the mount.

I've written a little benchmark to sit in a loop for a couple of
seconds in several cpus in parallel doing open/write/close loops.

http://sr71.net/~dave/linux/openbench.c

The code in here is a a worst-possible case for this patch.  It
does opens on a _pair_ of files in two different mounts in parallel.
This should cause my code to lose its "operate on the same mount"
optimization completely.  This worst-case scenario causes a 3%
degredation in the benchmark.

I could probably get rid of even this 3%, but it would be more
complex than what I have here, and I think this is getting into
acceptable territory.  In practice, I expect writing more than 3
bytes to a file, as well as disk I/O to mask any effects that this
has.

(To get rid of that 3%, we could have an #defined number of mounts
in the percpu variable.  So, instead of a CPU getting operate only
on percpu data when it accesses only one mount, it could stay on
percpu data when it only accesses N or fewer mounts.)

[AV] merged fix for __clear_mnt_mount() stepping on freed vfsmount

Acked-by: Al Viro <viro@ZenIV.linux.org.uk>
Signed-off-by: Christoph Hellwig <hch@infradead.org>
Signed-off-by: Dave Hansen <haveblue@us.ibm.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
2008-04-19 00:29:27 -04:00
Dave Hansen
aceaf78da9 [PATCH] r/o bind mounts: create helper to drop file write access
If someone decides to demote a file from r/w to just
r/o, they can use this same code as __fput().

NFS does just that, and will use this in the next
patch.

AV: drop write access in __fput() only after we evict from file list.

Signed-off-by: Dave Hansen <haveblue@us.ibm.com>
Cc: Erez Zadok <ezk@cs.sunysb.edu>
Cc: Trond Myklebust <trond.myklebust@fys.uio.no>
Cc: "J Bruce Fields" <bfields@fieldses.org>
Acked-by: Al Viro <viro@ZenIV.linux.org.uk>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Christoph Hellwig <hch@lst.de>
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
2008-04-19 00:25:32 -04:00
Dave Hansen
8366025eb8 [PATCH] r/o bind mounts: stub functions
This patch adds two function mnt_want_write() and mnt_drop_write().  These are
used like a lock pair around and fs operations that might cause a write to the
filesystem.

Before these can become useful, we must first cover each place in the VFS
where writes are performed with a want/drop pair.  When that is complete, we
can actually introduce code that will safely check the counts before allowing
r/w<->r/o transitions to occur.

Acked-by: Serge Hallyn <serue@us.ibm.com>
Acked-by: Al Viro <viro@ZenIV.linux.org.uk>
Signed-off-by: Christoph Hellwig <hch@infradead.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Dave Hansen <haveblue@us.ibm.com>
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
2008-04-19 00:25:32 -04:00
Christoph Hellwig
a70e65df88 [PATCH] merge open_namei() and do_filp_open()
open_namei() will, in the future, need to take mount write counts
over its creation and truncation (via may_open()) operations.  It
needs to keep these write counts until any potential filp that is
created gets __fput()'d.

This gets complicated in the error handling and becomes very murky
as to how far open_namei() actually got, and whether or not that
mount write count was taken.  That makes it a bad interface.

All that the current do_filp_open() really does is allocate the
nameidata on the stack, then call open_namei().

So, this merges those two functions and moves filp_open() over
to namei.c so it can be close to its buddy: do_filp_open().  It
also gets a kerneldoc comment in the process.

Acked-by: Al Viro <viro@ZenIV.linux.org.uk>
Signed-off-by: Christoph Hellwig <hch@lst.de>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Dave Hansen <haveblue@us.ibm.com>
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
2008-04-19 00:25:32 -04:00
Matthew Wilcox
6188e10d38 Convert asm/semaphore.h users to linux/semaphore.h
Signed-off-by: Matthew Wilcox <willy@linux.intel.com>
2008-04-18 22:22:54 -04:00
Matthew Wilcox
5a6483feb0 include: Remove unnecessary inclusions of asm/semaphore.h
None of these files use any of the functionality promised by
asm/semaphore.h.  It's possible that they (or some user of them) rely
on it dragging in some unrelated header file, but I can't build all
these files, so we'll have to fix any build failures as they come up.

Signed-off-by: Matthew Wilcox <willy@linux.intel.com>
2008-04-18 22:16:54 -04:00
Linus Torvalds
3925e6fc1f Merge branch 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/jmorris/security-testing-2.6
* 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/jmorris/security-testing-2.6:
  security: fix up documentation for security_module_enable
  Security: Introduce security= boot parameter
  Audit: Final renamings and cleanup
  SELinux: use new audit hooks, remove redundant exports
  Audit: internally use the new LSM audit hooks
  LSM/Audit: Introduce generic Audit LSM hooks
  SELinux: remove redundant exports
  Netlink: Use generic LSM hook
  Audit: use new LSM hooks instead of SELinux exports
  SELinux: setup new inode/ipc getsecid hooks
  LSM: Introduce inode_getsecid and ipc_getsecid hooks
2008-04-18 18:18:30 -07:00
Linus Torvalds
334d094504 Merge git://git.kernel.org/pub/scm/linux/kernel/git/davem/net-2.6.26
* git://git.kernel.org/pub/scm/linux/kernel/git/davem/net-2.6.26: (1090 commits)
  [NET]: Fix and allocate less memory for ->priv'less netdevices
  [IPV6]: Fix dangling references on error in fib6_add().
  [NETLABEL]: Fix NULL deref in netlbl_unlabel_staticlist_gen() if ifindex not found
  [PKT_SCHED]: Fix datalen check in tcf_simp_init().
  [INET]: Uninline the __inet_inherit_port call.
  [INET]: Drop the inet_inherit_port() call.
  SCTP: Initialize partial_bytes_acked to 0, when all of the data is acked.
  [netdrvr] forcedeth: internal simplifications; changelog removal
  phylib: factor out get_phy_id from within get_phy_device
  PHY: add BCM5464 support to broadcom PHY driver
  cxgb3: Fix __must_check warning with dev_dbg.
  tc35815: Statistics cleanup
  natsemi: fix MMIO for PPC 44x platforms
  [TIPC]: Cleanup of TIPC reference table code
  [TIPC]: Optimized initialization of TIPC reference table
  [TIPC]: Remove inlining of reference table locking routines
  e1000: convert uint16_t style integers to u16
  ixgb: convert uint16_t style integers to u16
  sb1000.c: make const arrays static
  sb1000.c: stop inlining largish static functions
  ...
2008-04-18 18:02:35 -07:00
Ahmed S. Darwish
076c54c5bc Security: Introduce security= boot parameter
Add the security= boot parameter. This is done to avoid LSM
registration clashes in case of more than one bult-in module.

User can choose a security module to enable at boot. If no
security= boot parameter is specified, only the first LSM
asking for registration will be loaded. An invalid security
module name will be treated as if no module has been chosen.

LSM modules must check now if they are allowed to register
by calling security_module_enable(ops) first. Modify SELinux
and SMACK to do so.

Do not let SMACK register smackfs if it was not chosen on
boot. Smackfs assumes that smack hooks are registered and
the initial task security setup (swapper->security) is done.

Signed-off-by: Ahmed S. Darwish <darwish.07@gmail.com>
Acked-by: James Morris <jmorris@namei.org>
2008-04-19 10:00:51 +10:00
Ahmed S. Darwish
04305e4aff Audit: Final renamings and cleanup
Rename the se_str and se_rule audit fields elements to
lsm_str and lsm_rule to avoid confusion.

Signed-off-by: Casey Schaufler <casey@schaufler-ca.com>
Signed-off-by: Ahmed S. Darwish <darwish.07@gmail.com>
Acked-by: James Morris <jmorris@namei.org>
2008-04-19 09:59:43 +10:00
Ahmed S. Darwish
9d57a7f9e2 SELinux: use new audit hooks, remove redundant exports
Setup the new Audit LSM hooks for SELinux.
Remove the now redundant exported SELinux Audit interface.

Audit: Export 'audit_krule' and 'audit_field' to the public
since their internals are needed by the implementation of the
new LSM hook 'audit_rule_known'.

Signed-off-by: Casey Schaufler <casey@schaufler-ca.com>
Signed-off-by: Ahmed S. Darwish <darwish.07@gmail.com>
Acked-by: James Morris <jmorris@namei.org>
2008-04-19 09:53:46 +10:00
Ahmed S. Darwish
03d37d25e0 LSM/Audit: Introduce generic Audit LSM hooks
Introduce a generic Audit interface for security modules
by adding the following new LSM hooks:

audit_rule_init(field, op, rulestr, lsmrule)
audit_rule_known(krule)
audit_rule_match(secid, field, op, rule, actx)
audit_rule_free(rule)

Those hooks are only available if CONFIG_AUDIT is enabled.

Signed-off-by: Casey Schaufler <casey@schaufler-ca.com>
Signed-off-by: Ahmed S. Darwish <darwish.07@gmail.com>
Acked-by: James Morris <jmorris@namei.org>
Reviewed-by: Paul Moore <paul.moore@hp.com>
2008-04-19 09:52:36 +10:00
Ahmed S. Darwish
6b89a74be0 SELinux: remove redundant exports
Remove the following exported SELinux interfaces:
selinux_get_inode_sid(inode, sid)
selinux_get_ipc_sid(ipcp, sid)
selinux_get_task_sid(tsk, sid)
selinux_sid_to_string(sid, ctx, len)

They can be substitued with the following generic equivalents
respectively:
new LSM hook, inode_getsecid(inode, secid)
new LSM hook, ipc_getsecid*(ipcp, secid)
LSM hook, task_getsecid(tsk, secid)
LSM hook, sid_to_secctx(sid, ctx, len)

Signed-off-by: Casey Schaufler <casey@schaufler-ca.com>
Signed-off-by: Ahmed S. Darwish <darwish.07@gmail.com>
Acked-by: James Morris <jmorris@namei.org>
Reviewed-by: Paul Moore <paul.moore@hp.com>
2008-04-19 09:52:36 +10:00
Ahmed S. Darwish
8a076191f3 LSM: Introduce inode_getsecid and ipc_getsecid hooks
Introduce inode_getsecid(inode, secid) and ipc_getsecid(ipcp, secid)
LSM hooks. These hooks will be used instead of similar exported
SELinux interfaces.

Let {inode,ipc,task}_getsecid hooks set the secid to 0 by default
if CONFIG_SECURITY is not defined or if the hook is set to
NULL (dummy). This is done to notify the caller that no valid
secid exists.

Signed-off-by: Casey Schaufler <casey@schaufler-ca.com>
Signed-off-by: Ahmed S. Darwish <darwish.07@gmail.com>
Acked-by: James Morris <jmorris@namei.org>
Reviewed-by: Paul Moore <paul.moore@hp.com>
2008-04-19 09:52:32 +10:00
Linus Torvalds
2cca775bae Merge git://git.kernel.org/pub/scm/linux/kernel/git/jejb/scsi-misc-2.6
* git://git.kernel.org/pub/scm/linux/kernel/git/jejb/scsi-misc-2.6: (137 commits)
  [SCSI] iscsi: bidi support for iscsi_tcp
  [SCSI] iscsi: bidi support at the generic libiscsi level
  [SCSI] iscsi: extended cdb support
  [SCSI] zfcp: Fix error handling for blocked unit for send FCP command
  [SCSI] zfcp: Remove zfcp_erp_wait from slave destory handler to fix deadlock
  [SCSI] zfcp: fix 31 bit compile warnings
  [SCSI] bsg: no need to set BSG_F_BLOCK bit in bsg_complete_all_commands
  [SCSI] bsg: remove minor in struct bsg_device
  [SCSI] bsg: use better helper list functions
  [SCSI] bsg: replace kobject_get with blk_get_queue
  [SCSI] bsg: takes a ref to struct device in fops->open
  [SCSI] qla1280: remove version check
  [SCSI] libsas: fix endianness bug in sas_ata
  [SCSI] zfcp: fix compiler warning caused by poking inside new semaphore (linux-next)
  [SCSI] aacraid: Do not describe check_reset parameter with its value
  [SCSI] aacraid: Fix down_interruptible() to check the return value
  [SCSI] sun3_scsi_vme: add MODULE_LICENSE
  [SCSI] st: rename flush_write_buffer()
  [SCSI] tgt: use KMEM_CACHE macro
  [SCSI] initio: fix big endian problems for auto request sense
  ...
2008-04-18 11:25:31 -07:00
Linus Torvalds
ef38ff9d37 Merge git://git.kernel.org/pub/scm/linux/kernel/git/steve/gfs2-2.6-nmw
* git://git.kernel.org/pub/scm/linux/kernel/git/steve/gfs2-2.6-nmw: (49 commits)
  [GFS2] fix assertion in log_refund()
  [GFS2] fix GFP_KERNEL misuses
  [GFS2] test for IS_ERR rather than 0
  [GFS2] Invalidate cache at correct point
  [GFS2] fs/gfs2/recovery.c: suppress warnings
  [GFS2] Faster gfs2_bitfit algorithm
  [GFS2] Streamline quota lock/check for no-quota case
  [GFS2] Remove drop of module ref where not needed
  [GFS2] gfs2_adjust_quota has broken unstuffing code
  [GFS2] possible null pointer dereference fixup
  [GFS2] Need to ensure that sector_t is 64bits for GFS2
  [GFS2] re-support special inode
  [GFS2] remove gfs2_dev_iops
  [GFS2] fix file_system_type leak on gfs2meta mount
  [GFS2] Allow bmap to allocate extents
  [GFS2] Fix a page lock / glock deadlock
  [GFS2] proper extern for gfs2/locking/dlm/mount.c:gdlm_ops
  [GFS2] gfs2/ops_file.c should #include "ops_inode.h"
  [GFS2] be*_add_cpu conversion
  [GFS2] Fix bug where we called drop_bh incorrectly
  ...
2008-04-18 10:02:46 -07:00
Linus Torvalds
188da98800 Merge git://git.kernel.org/pub/scm/linux/kernel/git/bart/ide-2.6
* git://git.kernel.org/pub/scm/linux/kernel/git/bart/ide-2.6: (58 commits)
  ide: remove ide_init_default_irq() macro
  ide: move default IDE ports setup to ide_generic host driver
  ide: remove obsoleted "idex=noprobe" kernel parameter (take 2)
  ide: remove needless hwif->irq check from ide_hwif_configure()
  ide: init hwif->{io_ports,irq} explicitly in legacy VLB host drivers
  ide: limit legacy VLB host drivers to alpha, x86 and mips
  cmd640: init hwif->{io_ports,irq} explicitly
  cmd640: cleanup setup_device_ptrs()
  ide: add ide-4drives host driver (take 3)
  ide: remove ppc ifdef from init_ide_data()
  ide: remove ide_default_io_ctl() macro
  ide: remove CONFIG_IDE_ARCH_OBSOLETE_INIT
  ide: add CONFIG_IDE_ARCH_OBSOLETE_DEFAULTS (take 2)
  ppc/pmac: remove no longer needed IDE quirk
  ppc: don't include <linux/ide.h>
  ppc: remove ppc_ide_md
  ppc/pplus: remove ppc_ide_md.ide_init_hwif hook
  ppc/sandpoint: remove ppc_ide_md hooks
  ppc/lopec: remove ppc_ide_md hooks
  ppc/mpc8xx: remove ppc_ide_md hooks
  ...
2008-04-18 08:39:24 -07:00
Linus Torvalds
07fe944e87 Merge branch 'next' of git://git.kernel.org/pub/scm/linux/kernel/git/djbw/async_tx
* 'next' of git://git.kernel.org/pub/scm/linux/kernel/git/djbw/async_tx:
  dmaengine: ack to flags: make use of the unused bits in the 'ack' field
  iop-adma: remove the workaround for missed interrupts on iop3xx
  async_tx: kill ->device_dependency_added
  async_tx: fix multiple dependency submission
  fsldma: Split the MPC83xx event from MPC85xx and refine irq codes.
  fsldma: Remove CONFIG_FSL_DMA_SELFTEST, keep fsl_dma_self_test() running always.
2008-04-18 08:38:55 -07:00
Linus Torvalds
8019aa946a Merge branch 'upstream-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/jgarzik/libata-dev
* 'upstream-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/jgarzik/libata-dev: (79 commits)
  ata-acpi: don't call _GTF for disabled drive
  sata_mv add temporary 3 second init delay for SiliconImage PMs
  sata_mv remove redundant edma init code
  sata_mv add basic port multiplier support
  sata_mv fix SOC flags, enable NCQ on SOC
  sata_mv disable hotplug for now
  sata_mv cosmetics
  sata_mv hardreset rework
  [libata] improve Kconfig help text for new PMP, SFF options
  libata: make EH fail gracefully if no reset method is available
  libata: Be a bit more slack about early devices
  libata: cable logic
  libata: move link onlineness check out of softreset methods
  libata: kill dead code paths in reset path
  pata_scc: fix build breakage
  libata: make PMP support optional
  libata: implement PMP helpers
  libata: separate PMP support code from core code
  libata: make SFF support optional
  libata: don't use ap->ioaddr in non-SFF drivers
  ...
2008-04-18 08:38:06 -07:00
Linus Torvalds
73e3e6481f Merge git://git.kernel.org/pub/scm/linux/kernel/git/tglx/linux-2.6-hrt
* git://git.kernel.org/pub/scm/linux/kernel/git/tglx/linux-2.6-hrt:
  clocksource: make clocksource watchdog cycle through online CPUs
  Documentation: move timer related documentation to a single place
  clockevents: optimise tick_nohz_stop_sched_tick() a bit
  locking: remove unused double_spin_lock()
  hrtimers: simplify lockdep handling
  timers: simplify lockdep handling
  posix-timers: fix shadowed variables
  timer_list: add annotations to workqueue.c
  hrtimer: use nanosleep specific restart_block fields
  hrtimer: add nanosleep specific restart_block member
2008-04-18 08:37:41 -07:00
Linus Torvalds
9732b61123 Merge git://git.kernel.org/pub/scm/linux/kernel/git/mingo/linux-2.6-kgdb
* git://git.kernel.org/pub/scm/linux/kernel/git/mingo/linux-2.6-kgdb:
  kgdb: always use icache flush for sw breakpoints
  kgdb: fix SMP NMI kgdb_handle_exception exit race
  kgdb: documentation fixes
  kgdb: allow static kgdbts boot configuration
  kgdb: add documentation
  kgdb: Kconfig fix
  kgdb: add kgdb internal test suite
  kgdb: fix several kgdb regressions
  kgdb: kgdboc pl011 I/O module
  kgdb: fix optional arch functions and probe_kernel_*
  kgdb: add x86 HW breakpoints
  kgdb: print breakpoint removed on exception
  kgdb: clocksource watchdog
  kgdb: fix NMI hangs
  kgdb: fix kgdboc dynamic module configuration
  kgdb: document parameters
  x86: kgdb support
  consoles: polling support, kgdboc
  kgdb: core
  uaccess: add probe_kernel_write()
2008-04-18 08:37:01 -07:00
Linus Torvalds
d7bb545d86 Merge branch 'semaphore' of git://git.kernel.org/pub/scm/linux/kernel/git/willy/misc
* 'semaphore' of git://git.kernel.org/pub/scm/linux/kernel/git/willy/misc:
  Remove DEBUG_SEMAPHORE from Kconfig
  Improve semaphore documentation
  Simplify semaphore implementation
  Add down_timeout and change ACPI to use it
  Introduce down_killable()
  Generic semaphore implementation
  Add semaphore.h to kernel_lock.c
  Fix quota.h includes
2008-04-18 08:25:29 -07:00
Linus Torvalds
75e98b3415 Merge branch 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/roland/infiniband
* 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/roland/infiniband: (104 commits)
  IB/iser: Don't change itt endianness
  IB/mlx4: Update module version and release date
  IPoIB: Handle case when P_Key is deleted and re-added at same index
  IB/iser: Release connection resources on RDMA_CM_EVENT_DEVICE_REMOVAL event
  IB/mlx4: Fix incorrect comment
  IB/mlx4: Fix race when detaching a QP from a multicast group
  IB/ehca: Support all ibv_devinfo values in query_device() and query_port()
  RDMA/nes: Free IRQ before killing tasklet
  IB/mthca: Update module version and release date
  IB/mlx4: Update QP state if query QP succeeds
  IB/mthca: Update QP state if query QP succeeds
  RDMA/amso1100: Add check for NULL reply_msg in c2_intr()
  IB/mlx4: Add support for resizing CQs
  IB/mlx4: Add support for modifying CQ moderation parameters
  IPoIB: Support modifying IPoIB CQ event moderation
  IB/core: Add support for modify CQ
  IPoIB: Add basic ethtool support
  mlx4_core: Increase max number of QPs to 128K
  RDMA/amso1100: Add support for "send with invalidate" work requests
  IB/core: Add support for "send with invalidate" work requests
  ...
2008-04-18 08:20:06 -07:00
Linus Torvalds
4cba84b5d6 Merge branch 'for-linus' of git://git390.osdl.marist.edu/pub/scm/linux-2.6
* 'for-linus' of git://git390.osdl.marist.edu/pub/scm/linux-2.6: (36 commits)
  [S390] Remove code duplication from monreader / dcssblk.
  [S390] kernel: show last breaking-event-address on oops
  [S390] lowcore: Change type of lowcores softirq_pending to __u32.
  [S390] zcrypt: Comments and kernel-doc cleanup
  [S390] uaccess: Always access the correct address space.
  [S390] Fix a lot of sparse warnings.
  [S390] Convert s390 to GENERIC_CLOCKEVENTS.
  [S390] genirq/clockevents: move irq affinity prototypes/inlines to interrupt.h
  [S390] Convert monitor calls to function calls.
  [S390] qdio (new feature): enhancing info-retrieval from QDIO-adapters
  [S390] replace remaining __FUNCTION__ occurrences
  [S390] remove redundant display of free swap space in show_mem()
  [S390] qdio: remove outdated developerworks link.
  [S390] Add debug_register_mode() function to debug feature API
  [S390] crypto: use more descriptive function names for init/exit routines.
  [S390] switch sched_clock to store-clock-extended.
  [S390] zcrypt: add support for large random numbers
  [S390] hw_random: allow rng_dev_read() to return hardware errors.
  [S390] Vertical cpu management.
  [S390] cpu topology support for s390.
  ...
2008-04-18 08:19:15 -07:00
Linus Torvalds
7d939fbdfe Merge branch 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/penberg/slab-2.6
* 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/penberg/slab-2.6:
  slub: No need for per node slab counters if !SLUB_DEBUG
  slub: Move map/flag clearing to __free_slab
  slub: Fixes to per cpu stat output in sysfs
  slub: Deal with config variable dependencies
  slub: Reduce #ifdef ZONE_DMA by moving kmalloc_caches_dma near dma logic
  slub: Initialize per-cpu stats
2008-04-18 08:19:00 -07:00
David S. Miller
1e42198609 Merge branch 'master' of master.kernel.org:/pub/scm/linux/kernel/git/torvalds/linux-2.6 2008-04-17 23:56:30 -07:00
Bartlomiej Zolnierkiewicz
0e33555fff ide: add CONFIG_IDE_ARCH_OBSOLETE_DEFAULTS (take 2)
* Add CONFIG_IDE_ARCH_OBSOLETE_DEFAULTS to drivers/ide/Kconfig and use
  it instead of defining IDE_ARCH_OBSOLETE_DEFAULTS in <arch/ide.h>.

v2:
* Define ide_default_irq() in ide-probe.c/ns87415.c if not already defined
  and drop defining ide_default_irq() for CONFIG_IDE_ARCH_OBSOLETE_DEFAULTS=n.

  [ Thanks to Stephen Rothwell and David Miller for noticing the problem. ]

Cc: Stephen Rothwell <sfr@canb.auug.org.au>
Cc: David Miller <davem@davemloft.net>
Signed-off-by: Bartlomiej Zolnierkiewicz <bzolnier@gmail.com>
2008-04-18 00:46:33 +02:00
Borislav Petkov
eaec3e7ded ide: use generic ATAPI packet command flags in ide-{floppy,tape}
Signed-off-by: Borislav Petkov <petkovbb@gmail.com>
Signed-off-by: Bartlomiej Zolnierkiewicz <bzolnier@gmail.com>
2008-04-18 00:46:27 +02:00
Borislav Petkov
8303b46e18 ide: add generic packet command representation ide_atapi_pc
This new struct unifies ide{-floppy,-tape,-scsi}'s view of a packet command. For now,
it represents the common denominator between the three drivers while adding driver-
specific members at the end of the struct which will be merged/simplified into the
generic ATAPI handling code in later steps, or removed completely.

Bart:
- move struct ide_atapi_pc outside of #ifdef/#endif CONFIG_IDE_PROC_FS

Signed-off-by: Borislav Petkov <petkovbb@gmail.com>
Signed-off-by: Bartlomiej Zolnierkiewicz <bzolnier@gmail.com>
2008-04-18 00:46:26 +02:00
Bartlomiej Zolnierkiewicz
23579a2a17 ide: remove IDE_*_REG macros
* Add IDE_{ALTSTATUS,IREASON,BCOUNTL,BCOUNTH}_OFFSET defines.

* Remove IDE_*_REG macros - this results in more readable
  and slightly smaller code.

There should be no functional changes caused by this patch.

Cc: Borislav Petkov <petkovbb@gmail.com>
Acked-by: Sergei Shtylyov <sshtylyov@ru.mvista.com>
Signed-off-by: Bartlomiej Zolnierkiewicz <bzolnier@gmail.com>
2008-04-18 00:46:26 +02:00
Bartlomiej Zolnierkiewicz
7616c0ad20 ide: add ide_atapi_{discard_data,write_zeros} inline helpers
Add ide_atapi_{discard_data,write_zeros} inline helpers to <linux/ide.h>
and use them instead of home-brewn helpers in ide-{floppy,tape,scsi}.

There should be no functional changes caused by this patch.

Cc: Borislav Petkov <petkovbb@gmail.com>
Signed-off-by: Bartlomiej Zolnierkiewicz <bzolnier@gmail.com>
2008-04-18 00:46:26 +02:00
Bartlomiej Zolnierkiewicz
e6bfa38a48 ide: remove ide_init_hwif_ports()
ide_init_hwif_ports() is only used by init_ide_data() now, inline it there.

Signed-off-by: Bartlomiej Zolnierkiewicz <bzolnier@gmail.com>
2008-04-18 00:46:25 +02:00
Bartlomiej Zolnierkiewicz
2304dc6481 ide: remove ->hold field from ide_hwif_t (take 2)
->hold is write-only now, remove it.

v2:
* v1 missed bast-ide, palm_bk3710, ide-cs and delkin_cb host drivers.

Acked-by: Sergei Shtylyov <sshtylyov@ru.mvista.com>
Signed-off-by: Bartlomiej Zolnierkiewicz <bzolnier@gmail.com>
2008-04-18 00:46:24 +02:00
Bartlomiej Zolnierkiewicz
93de00fd1c ide: remove broken/dangerous HDIO_[UNREGISTER,SCAN]_HWIF ioctls (take 3)
hdparm explicitely marks HDIO_[UNREGISTER,SCAN]_HWIF ioctls as DANGEROUS
and given the number of bugs we can assume that there are no real users:

* DMA has no chance of working because DMA resources are released by
  ide_unregister() and they are never allocated again.

* Since ide_init_hwif_ports() is used for ->io_ports[] setup the ioctls
  don't work for almost all hosts with "non-standard" (== non ISA-like)
  layout of IDE taskfile registers (there is a lot of such host drivers).

* ide_port_init_devices() is not called when probing IDE devices so:
  - drive->autotune is never set and IDE host/devices are not programmed
    for the correct PIO/DMA transfer modes (=> possible data corruption)
  - host specific I/O 32-bit and IRQ unmasking settings are not applied
    (=> possible data corruption)
  - host specific ->port_init_devs method is not called (=> no luck with
    ht6560b, qd65xx and opti621 host drivers)

* ->rw_disk method is not preserved (=> no HPT3xxN chipsets support).

* ->serialized flag is not preserved (=> possible data corruption when
   using icside, aec62xx (ATP850UF chipset), cmd640, cs5530, hpt366
   (HPT3xxN chipsets), rz1000, sc1200, dtc2278 and ht6560b host drivers).

* ->ack_intr method is not preserved (=> needed by ide-cris, buddha,
  gayle and macide host drivers).

* ->sata_scr[] and sata_misc[] is cleared by ide_unregister() and it
  isn't initialized again (SiI3112 support needs them).

* To issue an ioctl() there need to be at least one IDE device present
  in the system.

* ->cable_detect method is not preserved + it is not called when probing
  IDE devices so cable detection is broken (however since DMA support is
  also broken it doesn't really matter ;-).

* Some objects which may have already been freed in ide_unregister()
  are restored by ide_hwif_restore() (i.e. ->hwgroup).

* ide_register_hw() may unregister unrelated IDE ports if free ide_hwifs[]
  slot cannot be found.

* When IDE host drivers are modular unregistered port may be re-used by
  different host driver that owned it first causing subtle bugs.

Since we now have a proper warm-plug support remove these ioctls,
then remove no longer needed:
- ide_register_hw() and ide_hwif_restore() functions
- 'init_default' and 'restore' arguments of ide_unregister()
- zeroeing of hwif->{dma,extra}_* fields in ide_unregister()

As an added bonus IDE core code size shrinks by ~3kB (x86-32).

v2:
* fix ide_unregister() arguments in cleanup_module() (Andrew Morton).

v3:
* fix ide_unregister() arguments in palm_bk3710.c.

Acked-by: Sergei Shtylyov <sshtylyov@ru.mvista.com>
Signed-off-by: Bartlomiej Zolnierkiewicz <bzolnier@gmail.com>
2008-04-18 00:46:24 +02:00
Bartlomiej Zolnierkiewicz
9a0e77f28b ide: remove obsoleted "idex=base[,ctl[,irq]]" kernel parameters (take 2)
* Remove obsoleted "idex=base[,ctl[,irq]]" kernel parameters
  and update Documentation/ide/ide.txt.

* Remove no longer needed ide_forced chipset type.

v2:
* is_chipset_set[] -> is_chipset_set in ide.c.

* Documentation/ide/ide.txt fix.

Signed-off-by: Bartlomiej Zolnierkiewicz <bzolnier@gmail.com>
2008-04-18 00:46:24 +02:00
Bartlomiej Zolnierkiewicz
f74c91413e ide: add warm-plug support for IDE devices (take 2)
* Add 'struct class ide_port_class' ('ide_port' class) and a 'struct
  device *portdev' ('ide_port' class device) in ide_hwif_t.

* Register 'ide_port' class in ide_init() and unregister it in
  cleanup_module().

* Create ->portdev in ide_register_port () and unregister it in
  ide_unregister().

* Add "delete_devices" class device attribute for unregistering IDE devices
  on a port and "scan" one for probing+registering IDE devices on a port.

* Add ide_sysfs_register_port() helper for registering "delete_devices"
  and "scan" attributes with ->portdev.  Call it in ide_device_add_all().

* Document IDE warm-plug support in Documentation/ide/warm-plug-howto.txt.

v2:
* Convert patch from using 'struct class_device' to use 'struct device'.
  (thanks to Kay Sievers for doing it)

Signed-off-by: Bartlomiej Zolnierkiewicz <bzolnier@gmail.com>
2008-04-18 00:46:23 +02:00
Bartlomiej Zolnierkiewicz
50672e5d74 ide: remove dead/obsolete ->busproc method
->busproc method is used by HDIO_SET_BUSSTATE ioctl but it has no chance
of working as intended (in 2.4.x days) because to issue an ioctl there
is a device node needed and:

- for BUSSTATE_TRISTATE+OFF it is too late (devices are already gone)

- for BUSSTATE_TRISTATE+ON it is too early (devices are not registered yet)

Just remove ->busproc method for now (it was only implemented by hpt366,
siimage and tc86c001 host drivers).

Cc: Sergei Shtylyov <sshtylyov@ru.mvista.com>
Signed-off-by: Bartlomiej Zolnierkiewicz <bzolnier@gmail.com>
2008-04-18 00:46:23 +02:00
Bartlomiej Zolnierkiewicz
2dde7861af ide: rework PowerMac media-bay support (take 2)
Rework PowerMac media-bay support in such way that instead of
un/registering the IDE interface we un/register IDE devices:

* Add ide_port_scan() helper for probing+registerering devices on a port.

* Rename ide_port_unregister_devices() to __ide_port_unregister_devices().

* Add ide_port_unregister_devices() helper for unregistering devices on a port.

* Add 'ide_hwif_t *cd_port' to 'struct media_bay_info', pass 'hwif' instead
  of hwif->index to media_bay_set_ide_infos() and use it to setup 'cd_port'.

* Use ide_port_unregister_devices() instead of ide_unregister()
  and ide_port_scan() instead of ide_register_hw() in media_bay_step().

* Unexport ide_register_hw() and make it static.

v2:
* Fix build by adding <linux/ide.h> include to <asm-powerpc/mediabay.h>.
  (Reported by Michael/Kamalesh/Andrew).

Cc: Kamalesh Babulal <kamalesh@linux.vnet.ibm.com>
Cc: Michael Ellerman <michael@ellerman.id.au>
Cc: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Bartlomiej Zolnierkiewicz <bzolnier@gmail.com>
2008-04-18 00:46:23 +02:00
Bartlomiej Zolnierkiewicz
5b0c4b30a6 ide: remove IDE devices from /proc/ide/ before unregistering them
IDE devices need to be removed from /proc/ide/ _before_ being unregistered:

* Drop 'ide_hwif_t *hwif' argument from destroy_proc_ide_device()
  and use drive->hwif instead.

* Rename destroy_proc_ide_device() to ide_proc_unregister_device().

* Call ide_proc_unregister_device() in drive_release_dev().

* Remove no longer needed destroy_proc_ide_drives().

Signed-off-by: Bartlomiej Zolnierkiewicz <bzolnier@gmail.com>
2008-04-18 00:46:22 +02:00
Bartlomiej Zolnierkiewicz
4f0eee4d87 ide: use ide_find_port() instead of ide_deprecated_find_port()
* Use ide_find_port() instead of ide_deprecated_find_port() in bast-ide/
  palm_bk3710/ide-cs/delkin_cb host drivers and in ide_register_hw().

* Remove no longer needed ide_deprecated_find_port().

Signed-off-by: Bartlomiej Zolnierkiewicz <bzolnier@gmail.com>
2008-04-18 00:46:21 +02:00
Greg Kroah-Hartman
a594eeb1a1 IDE: remove ide=reverse IDE core
This option is obsolete and can be removed safely.

It allows us to remove the pci_get_device_reverse() function from the
PCI core.

Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
Signed-off-by: Bartlomiej Zolnierkiewicz <bzolnier@gmail.com>
2008-04-18 00:46:20 +02:00
Dan Williams
636bdeaa12 dmaengine: ack to flags: make use of the unused bits in the 'ack' field
'ack' is currently a simple integer that flags whether or not a client is done
touching fields in the given descriptor.  It is effectively just a single bit
of information.  Converting this to a flags parameter allows the other bits to
be put to use to control completion actions, like dma-unmap, and capture
results, like xor-zero-sum == 0.

Changes are one of:
1/ convert all open-coded ->ack manipulations to use async_tx_ack
   and async_tx_test_ack.
2/ set the ack bit at prep time where possible
3/ make drivers store the flags at prep time
4/ add flags to the device_prep_dma_interrupt prototype

Acked-by: Maciej Sosnowski <maciej.sosnowski@intel.com>
Signed-off-by: Dan Williams <dan.j.williams@intel.com>
2008-04-17 13:25:54 -07:00
Dan Williams
ce4d65a5db async_tx: kill ->device_dependency_added
DMA drivers no longer need to be notified of dependency submission
events as async_tx_run_dependencies and async_tx_channel_switch will
handle the scheduling and execution of dependent operations.

[sfr@canb.auug.org.au: extend this for fsldma]
Acked-by: Shannon Nelson <shannon.nelson@intel.com>
Signed-off-by: Dan Williams <dan.j.williams@intel.com>
2008-04-17 13:25:54 -07:00
Dan Williams
19242d7233 async_tx: fix multiple dependency submission
Shrink struct dma_async_tx_descriptor and introduce
async_tx_channel_switch to properly inject a channel switch interrupt in
the descriptor stream.  This simplifies the locking model as drivers no
longer need to handle dma_async_tx_descriptor.lock.

Acked-by: Shannon Nelson <shannon.nelson@intel.com>
Signed-off-by: Dan Williams <dan.j.williams@intel.com>
2008-04-17 13:25:05 -07:00
Tejun Heo
88fcd56275 libata: make PMP support optional
Make PMP support optional by adding CONFIG_SATA_PMP and leaving out
libata-pmp.c if it isn't set.  PMP helpers return constant values if
PMP support is not enabled and PMP declarations alias non-PMP
counterparts.  This makes the compiler to leave out PMP related part
out and LLDs to use non-PMP counterparts automatically.

Signed-off-by: Tejun Heo <htejun@gmail.com>
2008-04-17 15:44:25 -04:00
Tejun Heo
071f44b1d2 libata: implement PMP helpers
Implement helpers to test whether PMP is supported, attached and
determine pmp number to use when issuing SRST to a link.  While at it,
move ata_is_host_link() so that it's together with the two new PMP
helpers.

This change simplifies LLDs and helps making PMP support optional.

Signed-off-by: Tejun Heo <htejun@gmail.com>
2008-04-17 15:44:25 -04:00
Tejun Heo
48515f6c00 libata: separate PMP support code from core code
Most of PMP support code is already in libata-pmp.c.  All that are in
libata-core.c are sata_pmp_port_ops and EXPORTs.  Move them to
libata-pmp.c.  Also, collect PMP related prototypes and declarations
in header files and move them right above of SFF stuff.

This change is to make PMP support optional.

Signed-off-by: Tejun Heo <htejun@gmail.com>
2008-04-17 15:44:25 -04:00
Tejun Heo
127102aea2 libata: make SFF support optional
Now that SFF support is completely separated out from the core layer,
it can be made optional.  Add CONFIG_ATA_SFF and let SFF drivers
depend on it.  If CONFIG_ATA_SFF isn't set, all codes in libata-sff.c
and data structures for SFF support are disabled.  This saves good
number of bytes for small systems.

Signed-off-by: Tejun Heo <htejun@gmail.com>
2008-04-17 15:44:24 -04:00
Tejun Heo
c9f75b04ed libata: kill ata_noop_dev_select()
Now that SFF assumptions are separated out from non-SFF reset
sequence, port_ops->sff_dev_select() is no longer necessary for
non-SFF controllers.  Kill ata_noop_dev_select() and ->sff_dev_select
initialization from base and other non-SFF port_ops.

Signed-off-by: Tejun Heo <htejun@gmail.com>
2008-04-17 15:44:24 -04:00
Tejun Heo
79f97dadfe libata: drop @finish_qc from ata_qc_complete_multiple()
ata_qc_complete_multiple() took @finish_qc and called it on every qc
before completing it.  This was to give opportunity to update TF cache
before ata_qc_complete() tries to fill result_tf.  Now that result TF
is a separate operation, this is no longer necessary.

Update sata_sil24, which was the only user of this mechanism, such
that it implements its own ops->qc_fill_rtf() and drop @finish_qc from
ata_qc_complete_multiple().

Signed-off-by: Tejun Heo <htejun@gmail.com>
2008-04-17 15:44:23 -04:00
Tejun Heo
22183bf569 libata: add qc_fill_rtf port operation
On command completion, ata_qc_complete() directly called ops->tf_read
to fill qc->result_tf.  This patch adds ops->qc_fill_rtf to replace
hardcoded ops->tf_read usage.

ata_sff_qc_fill_rtf() which uses ops->tf_read to fill result_tf is
implemented and set in ata_base_port_ops and other ops tables which
don't inherit from ata_base_port_ops, so this patch doesn't introduce
any behavior change.

ops->qc_fill_rtf() is similar to ops->sff_tf_read() but can only be
called when a command finishes.  As some non-SFF controllers don't
have TF registers defined unless they're associated with in-flight
commands, this limited operation makes life easier for those drivers
and help lifting SFF assumptions from libata core layer.

Signed-off-by: Tejun Heo <htejun@gmail.com>
2008-04-17 15:44:23 -04:00
Tejun Heo
5958e3025f libata: move PMP SCR access failure during reset to ata_eh_reset()
If PMP fan-out reset fails and SCR isn't accessible, PMP should be
reset.  This used to be tested by sata_pmp_std_hardreset() and
communicated to EH by -ERESTART.  However, this logic is generic and
doesn't really have much to do with specific hardreset implementation.

This patch moves SCR access failure detection logic to ata_eh_reset()
where it belongs.  As this makes sata_pmp_std_hardreset() identical to
sata_std_hardreset(), the function is killed and replaced with the
standard method.

Signed-off-by: Tejun Heo <htejun@gmail.com>
2008-04-17 15:44:23 -04:00
Tejun Heo
ac371987a8 libata: clear SError after link resume
SError used to be cleared in ->postreset.  This has small hotplug race
condition.  If a device is plugged in after reset is complete but
postreset hasn't run yet, its hotplug event gets lost when SError is
cleared.  This patch makes sata_link_resume() clear SError.  This
kills the race condition and makes a lot of sense as some PMP and host
PHYs don't work properly without SError cleared.

This change makes sata_pmp_std_{pre|post}_reset()'s unnecessary as
they become identical to ata_std counterparts.  It also simplifies
sata_pmp_hardreset() and ahci_vt8251_hardreset().

Signed-off-by: Tejun Heo <htejun@gmail.com>
2008-04-17 15:44:23 -04:00
Tejun Heo
57c9efdfb3 libata: implement and use sata_std_hardreset()
Implement sata_std_hardreset(), which simply wraps around
sata_link_hardreset().  sata_std_hardreset() becomes new standard
hardreset method for sata_port_ops and sata_sff_hardreset() moves from
ata_base_port_ops to ata_sff_port_ops, which is where it really
belongs.

ata_is_builtin_hardreset() is added so that both
ata_std_error_handler() and ata_sff_error_handler() skip both builtin
hardresets if SCR isn't accessible.

piix_sidpr_hardreset() in ata_piix.c is identical to
sata_std_hardreset() in functionality and got replaced with the
standard function.

Signed-off-by: Tejun Heo <htejun@gmail.com>
2008-04-17 15:44:23 -04:00
Tejun Heo
9dadd45b24 libata: move generic hardreset code from sata_sff_hardreset() to sata_link_hardreset()
sata_sff_hardreset() contains link readiness wait logic which isn't
SFF specific.  Move that part into sata_link_hardreset(), which now
takes two more parameters - @online and @check_ready.  Both are
optional.  The former is out parameter for link onlineness after
reset.  The latter is used to wait for link readiness after hardreset.

Users of sata_link_hardreset() is updated to use new funtionality and
ahci_hardreset() is updated to use sata_link_hardreset() instead of
sata_sff_hardreset().  This doesn't really cause any behavior change.

Signed-off-by: Tejun Heo <htejun@gmail.com>
2008-04-17 15:44:22 -04:00
Tejun Heo
aa2731ad9a libata: separate out ata_wait_ready() and implement ata_wait_after_reset()
Factor out waiting logic (which is common to all ATA controllers) from
ata_sff_wait_ready() into ata_wait_ready().  ata_wait_ready() takes
@check_ready function pointer and uses it to poll for readiness.  This
allows non-SFF controllers to use ata_wait_ready() to wait for link
readiness.

This patch also implements ata_wait_after_reset() - generic version of
ata_sff_wait_after_reset() - using ata_wait_ready().

ata_sff_wait_ready() is reimplemented using ata_wait_ready() and
ata_sff_check_ready().  Functionality remains the same.

Signed-off-by: Tejun Heo <htejun@gmail.com>
2008-04-17 15:44:22 -04:00
Tejun Heo
705e76beb9 libata: restructure SFF post-reset readiness waits
Previously, post-softreset readiness is waited as follows.

1. ata_sff_wait_after_reset() waits for 150ms and then for
   ATA_TMOUT_FF_WAIT if status is 0xff and other conditions meet.

2. ata_bus_softreset() finishes with -ENODEV if status is still 0xff.
   If not, continue to #3.

3. ata_bus_post_reset() waits readiness of dev0 and/or dev1 depending
   on devmask using ata_sff_wait_ready().

And for post-hardreset readiness,

1. ata_sff_wait_after_reset() waits for 150ms and then for
   ATA_TMOUT_FF_WAIT if status is 0xff and other conditions meet.

2. sata_sff_hardreset waits for device readiness using
   ata_sff_wait_ready().

This patch merges and unifies post-reset readiness waits into
ata_sff_wait_ready() and ata_sff_wait_after_reset().

ATA_TMOUT_FF_WAIT handling is merged into ata_sff_wait_ready().  If TF
status is 0xff, link status is unknown and the port is SATA, it will
continue polling till ATA_TMOUT_FF_WAIT.

ata_sff_wait_after_reset() is updated to perform the following steps.

1. waits for 150ms.

2. waits for dev0 readiness using ata_sff_wait_ready().  Note that
   this is done regardless of devmask, as ata_sff_wait_ready() handles
   0xff status correctly, this preserves the original behavior except
   that it may wait longer after softreset if link is online but
   status is 0xff.  This behavior change is very unlikely to cause any
   actual difference and is intended.  It brings softreset behavior to
   that of hardreset.

3. waits for dev1 readiness just the same way ata_bus_post_reset() did.

Now both soft and hard resets call ata_sff_wait_after_reset() after
reset to wait for readiness after resets.  As
ata_sff_wait_after_reset() contains calls to ->sff_dev_select(),
explicit call near the end of sata_sff_hardreset() is removed.

This change makes reset implementation simpler and more consistent.

While at it, make the magical 150ms wait post-reset wait duration a
constant and ata_sff_wait_ready() and ata_sff_wait_after_reset() take
@link instead of @ap.  This is to make them consistent with other
reset helpers and ease core changes.

pata_scc is updated accordingly.

Signed-off-by: Tejun Heo <htejun@gmail.com>
2008-04-17 15:44:22 -04:00
Tejun Heo
203c75b824 libata: separate out ata_std_postreset() from ata_sff_postreset()
Separate out generic ATA portion from ata_sff_postreset() into
ata_std_postreset() and implement ata_sff_postreset() using the std
version.

ata_base_port_ops now has ata_std_postreset() for its postreset and
ata_sff_port_ops overrides it to ata_sff_postreset().

This change affects pdc_adma, ahci, sata_fsl and sata_sil24.  pdc_adma
now specifies postreset to ata_sff_postreset() explicitly.  sata_fsl
and sata_sil24 now use ata_std_postreset() which makes no difference
to them.  ahci now calls ata_std_postreset() from its own postreset
method, which causes no behavior difference.

Signed-off-by: Tejun Heo <htejun@gmail.com>
2008-04-17 15:44:22 -04:00
Tejun Heo
0aa1113d54 libata: separate out ata_std_prereset() from ata_sff_prereset()
Separate out generic ATA portion from ata_sff_prereset() into
ata_std_prereset() and implement ata_sff_prereset() using the std
version.  Waiting for device readiness is the only SFF specific part.

ata_base_port_ops now has ata_std_prereset() for its prereset and
ata_sff_port_ops overrides it to ata_sff_prereset().  This change can
affect pdc_adma, ahci, sata_fsl and sata_sil24.  pdc_adma implements
its own prereset using ata_sff_prereset() and the rest has hardreset
and thus are unaffected by this change.

This change reflects real world situation.  There is no generic way to
wait for device readiness for non-SFF controllers and some of them
don't have any mechanism for that.  Non-sff drivers which don't have
hardreset should wrap ata_std_prereset() and wait for device readiness
itself but there's no such driver now and isn't likely to be popular
in the future either.

Signed-off-by: Tejun Heo <htejun@gmail.com>
2008-04-17 15:44:22 -04:00
Tejun Heo
288623a06c libata: clean up port_ops->sff_irq_clear()
->sff_irq_clear() is called only from SFF interrupt handler, so there
is no reason to initialize it for non-SFF controllers.  Also,
ata_sff_irq_clear() can handle both BMDMA and non-BMDMA SFF
controllers.

This patch kills ata_noop_irq_clear() and removes it from base
port_ops and sets ->sff_irq_clear to ata_sff_irq_clear() in sff
port_ops instead of bmdma port_ops.

Signed-off-by: Tejun Heo <htejun@gmail.com>
2008-04-17 15:44:22 -04:00
Tejun Heo
5682ed33aa libata: rename SFF port ops
Add sff_ prefix to SFF specific port ops.

This rename is in preparation of separating SFF support out of libata
core layer.  This patch strictly renames ops and doesn't introduce any
behavior difference.

Signed-off-by: Tejun Heo <htejun@gmail.com>
2008-04-17 15:44:22 -04:00
Tejun Heo
9363c3825e libata: rename SFF functions
SFF functions have confusing names.  Some have sff prefix, some have
bmdma, some std, some pci and some none.  Unify the naming by...

* SFF functions which are common to both BMDMA and non-BMDMA are
  prefixed with ata_sff_.

* SFF functions which are specific to BMDMA are prefixed with
  ata_bmdma_.

* SFF functions which are specific to PCI but apply to both BMDMA and
  non-BMDMA are prefixed with ata_pci_sff_.

* SFF functions which are specific to PCI and BMDMA are prefixed with
  ata_pci_bmdma_.

* Drop generic prefixes from LLD specific routines.  For example,
  bfin_std_dev_select -> bfin_dev_select.

The following renames are noteworthy.

  ata_qc_issue_prot() -> ata_sff_qc_issue()
  ata_pci_default_filter() -> ata_bmdma_mode_filter()
  ata_dev_try_classify() -> ata_sff_dev_classify()

This rename is in preparation of separating SFF support out of libata
core layer.  This patch strictly renames functions and doesn't
introduce any behavior difference.

Signed-off-by: Tejun Heo <htejun@gmail.com>
2008-04-17 15:44:21 -04:00
Yoichi Yuasa
83c063dd73 use ATA_TAG_INTERNAL in ata_tag_internal()
It should be ATA_TAG_INTERNAL.

Signed-off-by: Yoichi Yuasa <yoichi_yuasa@tripeaks.co.jp>
Signed-off-by: Jeff Garzik <jeff@garzik.org>
2008-04-17 15:44:21 -04:00
Tejun Heo
03faab7827 libata: implement ATA_QCFLAG_RETRY
Currently whether a command should be retried after failure is
determined inside ata_eh_finish().  Add ATA_QCFLAG_RETRY and move the
logic into ata_eh_autopsy().  This makes things clearer and helps
extending retry determination logic.

Signed-off-by: Tejun Heo <htejun@gmail.com>
Signed-off-by: Jeff Garzik <jeff@garzik.org>
2008-04-17 15:44:20 -04:00
Tejun Heo
6fd3639011 libata: kill ata_chk_status()
ata_chk_status() just calls ops->check_status and it only adds
confusion with other status functions.  Kill it.

Signed-off-by: Tejun Heo <htejun@gmail.com>
Signed-off-by: Jeff Garzik <jeff@garzik.org>
2008-04-17 15:44:18 -04:00
Tejun Heo
071ce34d57 libata: move ata_pci_default_filter() out of CONFIG_PCI
ata_pci_default_filter() doesn't really have anything to do with PCI.
It's generally applicable to BMDMA controllers.  Move it out of
CONFIG_PCI.

Signed-off-by: Tejun Heo <htejun@gmail.com>
Signed-off-by: Jeff Garzik <jeff@garzik.org>
2008-04-17 15:44:18 -04:00
Tejun Heo
624d5c514e libata: reorganize SFF related stuff
* Move SFF related functions from libata-core.c to libata-sff.c.

  ata_[bmdma_]sff_port_ops, ata_devchk(), ata_dev_try_classify(),
  ata_std_dev_select(), ata_tf_to_host(), ata_busy_sleep(),
  ata_wait_after_reset(), ata_wait_ready(), ata_bus_post_reset(),
  ata_bus_softreset(), ata_bus_reset(), ata_std_softreset(),
  sata_std_hardreset(), ata_fill_sg(), ata_fill_sg_dumb(),
  ata_qc_prep(), ata_dump_qc_prep(), ata_data_xfer(),
  ata_data_xfer_noirq(), ata_pio_sector(), ata_pio_sectors(),
  atapi_send_cdb(), __atapi_pio_bytes(), atapi_pio_bytes(),
  ata_hsm_ok_in_wq(), ata_hsm_qc_complete(), ata_hsm_move(),
  ata_pio_task(), ata_qc_issue_prot(), ata_host_intr(),
  ata_interrupt(), ata_std_ports()

* Make ata_pio_queue_task() global as it's now called from
  libata-sff.c.

* Move SFF related stuff in include/linux/libata.h and
  drivers/ata/libata.h into one place.  While at it, move timing
  constants into the global enum definition and fortify comments a
  bit.

This patch strictly moves stuff around and as such doesn't cause any
functional difference.

Signed-off-by: Tejun Heo <htejun@gmail.com>
Signed-off-by: Jeff Garzik <jeff@garzik.org>
2008-04-17 15:44:18 -04:00
Tejun Heo
a1efdaba2d libata: make reset related methods proper port operations
Currently reset methods are not specified directly in the
ata_port_operations table.  If a LLD wants to use custom reset
methods, it should construct and use a error_handler which uses those
reset methods.  It's done this way for two reasons.

First, the ops table already contained too many methods and adding
four more of them would noticeably increase the amount of necessary
boilerplate code all over low level drivers.

Second, as ->error_handler uses those reset methods, it can get
confusing.  ie. By overriding ->error_handler, those reset ops can be
made useless making layering a bit hazy.

Now that ops table uses inheritance, the first problem doesn't exist
anymore.  The second isn't completely solved but is relieved by
providing default values - most drivers can just override what it has
implemented and don't have to concern itself about higher level
callbacks.  In fact, there currently is no driver which actually
modifies error handling behavior.  Drivers which override
->error_handler just wraps the standard error handler only to prepare
the controller for EH.  I don't think making ops layering strict has
any noticeable benefit.

This patch makes ->prereset, ->softreset, ->hardreset, ->postreset and
their PMP counterparts propoer ops.  Default ops are provided in the
base ops tables and drivers are converted to override individual reset
methods instead of creating custom error_handler.

* ata_std_error_handler() doesn't use sata_std_hardreset() if SCRs
  aren't accessible.  sata_promise doesn't need to use separate
  error_handlers for PATA and SATA anymore.

* softreset is broken for sata_inic162x and sata_sx4.  As libata now
  always prefers hardreset, this doesn't really matter but the ops are
  forced to NULL using ATA_OP_NULL for documentation purpose.

* pata_hpt374 needs to use different prereset for the first and second
  PCI functions.  This used to be done by branching from
  hpt374_error_handler().  The proper way to do this is to use
  separate ops and port_info tables for each function.  Converted.

Signed-off-by: Tejun Heo <htejun@gmail.com>
2008-04-17 15:44:18 -04:00
Tejun Heo
9594719362 libata: kill port_info->sht and ->irq_handler
libata core layer doesn't care about sht or ->irq_handler.  Those are
only of interest to the LLD during initialization.  This is confusing
and has caused several drivers to have duplicate unused initializers
for these fields.

Currently only sata_nv uses these fields.  Make sata_nv use
->private_data, which is supposed to carry LLD-specific information,
instead and kill ->sht and ->irq_handler.  nv_pi_priv structure is
defined and struct literals are used to initialize private_data.
Notational overhead is negligible.

Signed-off-by: Tejun Heo <htejun@gmail.com>
2008-04-17 15:44:17 -04:00
Tejun Heo
887125e374 libata: stop overloading port_info->private_data
port_info->private_data is currently used for two purposes - to record
private data about the port_info or to specify host->private_data to
use when allocating ata_host.

This overloading is confusing and counter-intuitive in that
port_info->private_data becomes host->private_data instead of
port->private_data.  In addition, port_info and host don't correspond
to each other 1-to-1.  Currently, the first non-NULL
port_info->private_data is used.

This patch makes port_info->private_data just be what it is -
private_data for the port_info where LLD can jot down extra info.
libata no longer sets host->private_data to the first non-NULL
port_info->private_data, @host_priv argument is added to
ata_pci_init_one() instead.  LLDs which use ata_pci_init_one() can use
this argument to pass in pointer to host private data.  LLDs which
don't should use init-register model anyway and can initialize
host->private_data directly.

Adding @host_priv instead of using init-register model for LLDs which
use ata_pci_init_one() is suggested by Alan Cox.

Signed-off-by: Tejun Heo <htejun@gmail.com>
Cc: Alan Cox <alan@lxorguk.ukuu.org.uk>
2008-04-17 15:44:17 -04:00
Tejun Heo
1bd5b715a3 libata: make ata_pci_init_one() not use ops->irq_handler and pi->sht
ata_pci_init_one() is the only function which uses ops->irq_handler
and pi->sht.  Other initialization functions take the same information
as arguments.  This causes confusion and duplicate unused entries in
structures.

Make ata_pci_init_one() take sht as an argument and use ata_interrupt
implicitly.  All current users use ata_interrupt and if different irq
handler is necessary open coding ata_pci_init_one() using
ata_prepare_sff_host() and ata_activate_sff_host can be done under ten
lines including error handling and driver which requires custom
interrupt handler is likely to require custom initialization anyway.

As ata_pci_init_one() was the last user of ops->irq_handler, this
patch also kills the field.

Signed-off-by: Tejun Heo <htejun@gmail.com>
2008-04-17 15:44:17 -04:00
Tejun Heo
029cfd6b74 libata: implement and use ops inheritance
libata lets low level drivers build ata_port_operations table and
register it with libata core layer.  This allows low level drivers
high level of flexibility but also burdens them with lots of
boilerplate entries.

This becomes worse for drivers which support related similar
controllers which differ slightly.  They share most of the operations
except for a few.  However, the driver still needs to list all
operations for each variant.  This results in large number of
duplicate entries, which is not only inefficient but also error-prone
as it becomes very difficult to tell what the actual differences are.

This duplicate boilerplates all over the low level drivers also make
updating the core layer exteremely difficult and error-prone.  When
compounded with multi-branched development model, it ends up
accumulating inconsistencies over time.  Some of those inconsistencies
cause immediate problems and fixed.  Others just remain there dormant
making maintenance increasingly difficult.

To rectify the problem, this patch implements ata_port_operations
inheritance.  To allow LLDs to easily re-use their own ops tables
overriding only specific methods, this patch implements poor man's
class inheritance.  An ops table has ->inherits field which can be set
to any ops table as long as it doesn't create a loop.  When the host
is started, the inheritance chain is followed and any operation which
isn't specified is taken from the nearest ancestor which has it
specified.  This operation is called finalization and done only once
per an ops table and the LLD doesn't have to do anything special about
it other than making the ops table non-const such that libata can
update it.

libata provides four base ops tables lower drivers can inherit from -
base, sata, pmp, sff and bmdma.  To avoid overriding these ops
accidentaly, these ops are declared const and LLDs should always
inherit these instead of using them directly.

After finalization, all the ops table are identical before and after
the patch except for setting .irq_handler to ata_interrupt in drivers
which didn't use to.  The .irq_handler doesn't have any actual effect
and the field will soon be removed by later patch.

* sata_sx4 is still using old style EH and currently doesn't take
  advantage of ops inheritance.

Signed-off-by: Tejun Heo <htejun@gmail.com>
2008-04-17 15:44:17 -04:00
Tejun Heo
68d1d07b51 libata: implement and use SHT initializers
libata lets low level drivers build scsi_host_template and register it
to the SCSI layer.  This allows low level drivers high level of
flexibility but also burdens them with lots of boilerplate entries.

This patch implements SHT initializers which can be used to initialize
all the boilerplate entries in a sht.  Three variants of them are
implemented - BASE, BMDMA and NCQ - for different types of drivers.
Note that entries can be overriden by putting individual initializers
after the helper macro.

All sht tables are identical before and after this patch.

Signed-off-by: Tejun Heo <htejun@gmail.com>
2008-04-17 15:44:17 -04:00
Tejun Heo
358f9a77a6 libata: implement and use ata_noop_irq_clear()
->irq_clear() is used to clear IRQ bit of a SFF controller and isn't
useful for drivers which don't use libata SFF HSM implementation.
However, it's a required callback and many drivers implement their own
noop version as placeholder.  This patch implements ata_noop_irq_clear
and use it to replace those custom placeholders.

Also, SFF drivers which don't support BMDMA don't need to use
ata_bmdma_irq_clear().  It becomes noop if BMDMA address isn't
initialized.  Convert them to use ata_noop_irq_clear().

Signed-off-by: Tejun Heo <htejun@gmail.com>
2008-04-17 15:44:16 -04:00
Tejun Heo
c1bc899f58 libata: reorganize ata_port_operations
Over the time, ops in ata_port_operations has become a bit confusing.
Reorganize.  SFF/BMDMA ops are separated into separate a group as they
will be taken out of ata_port_operations later.

Signed-off-by: Tejun Heo <htejun@gmail.com>
2008-04-17 15:44:16 -04:00
Tejun Heo
b558edddb1 libata: kill ata_ehi_schedule_probe()
ata_ehi_schedule_probe() was created to hide details of link-resuming
reset magic.  Now that all the softreset workarounds are gone,
scheduling probe is very simple - set probe_mask and request RESET.
Kill ata_ehi_schedule_probe() and open code it.  This also increases
consistency as ata_ehi_schedule_probe() couldn't cover individual
device probings so they were open-coded even when the helper existed.

While at it, define ATA_ALL_DEVICES as mask of all possible devices on
a link and always use it when requesting probe on link level for
simplicity and consistency.  Setting extra bits in the probe_mask
doesn't hurt anybody.

Signed-off-by: Tejun Heo <htejun@gmail.com>
2008-04-17 15:44:16 -04:00
Tejun Heo
8cebf274dd libata: kill ATA_LFLAG_SKIP_D2H_BSY
Some controllers can't reliably record the initial D2H FIS after SATA
link is brought online for whatever reason.  Advanced controllers
which don't have traditional TF register based interface often have
this problem as they don't really have the TF registers to update
while the controller and link are being initialized.

SKIP_D2H_BSY works around the problem by skipping the wait for device
readiness before issuing SRST, so for such controllers libata issues
SRST blindly and hopes for the best.

Now that libata defaults to hardreset, this workaround is no longer
necessary.  For controllers which have support for hardreset, SRST is
never issued by itself.  It is only issued as follow-up SRST for
device classification and PMP initialization, so there's no need to
wait for it from prereset.

Kill ATA_LFLAG_SKIP_D2H_BSY.

Signed-off-by: Tejun Heo <htejun@gmail.com>
2008-04-17 15:44:16 -04:00
Tejun Heo
672b2d65ba libata: kill ATA_EHI_RESUME_LINK
ATA_EHI_RESUME_LINK has two functions - promote reset to hardreset if
ATA_LFLAG_HRST_TO_RESUME is set and preventing EH from shortcutting
reset action when probing is requested.  The former is gone now and
the latter can easily be achieved by making EH to perform at least one
reset if reset is requested, which also makes more sense than
depending on RESUME_LINK flag.

As ATA_EHI_RESUME_LINK was the only EHI reset modifier, this also
kills reset modifier handling.

Signed-off-by: Tejun Heo <htejun@gmail.com>
2008-04-17 15:44:16 -04:00
Tejun Heo
d692abd92f libata: kill ATA_LFLAG_HRST_TO_RESUME
Now that hardreset is the preferred method of resetting, there's no
need for ATA_LFLAG_HRST_TO_RESUME flag.  Kill it.

Signed-off-by: Tejun Heo <htejun@gmail.com>
2008-04-17 15:44:15 -04:00
Tejun Heo
cf48062658 libata: prefer hardreset
When both soft and hard resets are available, libata preferred
softreset till now.  The logic behind it was to be softer to devices;
however, this doesn't really help much.  Rationales for the change:

* BIOS may freeze lock certain things during boot and softreset can't
  unlock those.  This by itself is okay but during operation PHY event
  or other error conditions can trigger hardreset and the device may
  end up with different configuration.

  For example, after a hardreset, previously unlockable HPA can be
  unlocked resulting in different device size and thus revalidation
  failure.  Similar condition can occur during or after resume.

* Certain ATAPI devices require hardreset to recover after certain
  error conditions.  On PATA, this is done by issuing the DEVICE RESET
  command.  On SATA, COMRESET has equivalent effect.  The problem is
  that DEVICE RESET needs its own execution protocol.

  For SFF controllers with bare TF access, it can be easily
  implemented but more advanced controllers (e.g. ahci and sata_sil24)
  require specialized implementations.  Simply using hardreset solves
  the problem nicely.

* COMRESET initialization sequence is the norm in SATA land and many
  SATA devices don't work properly if only SRST is used.  For example,
  some PMPs behave this way and libata works around by always issuing
  hardreset if the host supports PMP.

  Like the above example, libata has developed a number of mechanisms
  aiming to promote softreset to hardreset if softreset is not going
  to work.  This approach is time consuming and error prone.

  Also, note that, dependingon how you read the specs, it could be
  argued that PMP fan-out ports require COMRESET to start operation.
  In fact, all the PMPs on the market except one don't work properly
  if COMRESET is not issued to fan-out ports after PMP reset.

* COMRESET is an integral part of SATA connection and any working
  device should be able to handle COMRESET properly.  After all, it's
  the way to signal hardreset during reboot.  This is the most used
  and recommended (at least by the ahci spec) method of resetting
  devices.

So, this patch makes libata prefer hardreset over softreset by making
the following changes.

* Rename ATA_EH_RESET_MASK to ATA_EH_RESET and use it whereever
  ATA_EH_{SOFT|HARD}RESET used to be used.  ATA_EH_{SOFT|HARD}RESET is
  now only used to tell prereset whether soft or hard reset will be
  issued.

* Strip out now unneeded promote-to-hardreset logics from
  ata_eh_reset(), ata_std_prereset(), sata_pmp_std_prereset() and
  other places.

Signed-off-by: Tejun Heo <htejun@gmail.com>
2008-04-17 15:44:15 -04:00
Paul Gortmaker
cac1f3c8a8 phylib: factor out get_phy_id from within get_phy_device
We were already doing what amounts to a get_phy_id from within
get_phy_device, and rather than duplicate this for the TBIPA
probing, we might as well just factor it out and make it available
instead.

Signed-off-by: Paul Gortmaker <paul.gortmaker@windriver.com>
Acked-by: Andy Fleming <afleming@freescale.com>
Signed-off-by: Jeff Garzik <jgarzik@redhat.com>
2008-04-17 15:31:33 -04:00
Jason Wessel
e3e2aaf7dc kgdb: add documentation
Add in the kgdb documentation for kgdb.

Signed-off-by: Jason Wessel <jason.wessel@windriver.com>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
2008-04-17 20:05:42 +02:00
Jason Wessel
7c3078b637 kgdb: clocksource watchdog
In order to not trip the clocksource watchdog, kgdb must touch the
clocksource watchdog on the return to normal system run state.

Signed-off-by: Jason Wessel <jason.wessel@windriver.com>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
2008-04-17 20:05:38 +02:00
Jason Wessel
f2d937f3bf consoles: polling support, kgdboc
polled console handling support, to access a console in an irq-less
way while in debug or irq context.

absolutely zero impact as long as CONFIG_CONSOLE_POLL is disabled.
(which is the default)

[ jan.kiszka@siemens.com: lots of cleanups ]
[ mingo@elte.hu: redesign, splitups, cleanups. ]

Signed-off-by: Jason Wessel <jason.wessel@windriver.com>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
Signed-off-by: Jan Kiszka <jan.kiszka@web.de>
Reviewed-by: Thomas Gleixner <tglx@linutronix.de>
2008-04-17 20:05:37 +02:00
Jason Wessel
dc7d552705 kgdb: core
kgdb core code. Handles the protocol and the arch details.

[ mingo@elte.hu: heavily modified, simplified and cleaned up. ]
[ xemul@openvz.org: use find_task_by_pid_ns ]

Signed-off-by: Jason Wessel <jason.wessel@windriver.com>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
Signed-off-by: Jan Kiszka <jan.kiszka@web.de>
Reviewed-by: Thomas Gleixner <tglx@linutronix.de>
2008-04-17 20:05:37 +02:00
Ingo Molnar
c33fa9f560 uaccess: add probe_kernel_write()
add probe_kernel_read() and probe_kernel_write().

Uninlined and restricted to kernel range memory only, as suggested
by Linus.

Signed-off-by: Ingo Molnar <mingo@elte.hu>
Reviewed-by: Thomas Gleixner <tglx@linutronix.de>
2008-04-17 20:05:36 +02:00
Matthew Wilcox
714493cd54 Improve semaphore documentation
Move documentation from semaphore.h to semaphore.c as requested by
Andrew Morton.  Also reformat to kernel-doc style and add some more
notes about the implementation.

Signed-off-by: Matthew Wilcox <willy@linux.intel.com>
2008-04-17 10:43:01 -04:00
Matthew Wilcox
b17170b2fa Simplify semaphore implementation
By removing the negative values of 'count' and relying on the wait_list to
indicate whether we have any waiters, we can simplify the implementation
by removing the protection against an unlikely race condition.  Thanks to
David Howells for his suggestions.

Signed-off-by: Matthew Wilcox <willy@linux.intel.com>
2008-04-17 10:42:54 -04:00
Matthew Wilcox
f1241c87a1 Add down_timeout and change ACPI to use it
ACPI currently emulates a timeout for semaphores with calls to
down_trylock and sleep.  This produces horrible behaviour in terms of
fairness and excessive wakeups.  Now that we have a unified semaphore
implementation, adding a real down_trylock is almost trivial.

Signed-off-by: Matthew Wilcox <willy@linux.intel.com>
2008-04-17 10:42:46 -04:00
Matthew Wilcox
f06d968658 Introduce down_killable()
down_killable() is the functional counterpart of mutex_lock_killable.

Signed-off-by: Matthew Wilcox <willy@linux.intel.com>
2008-04-17 10:42:40 -04:00
Matthew Wilcox
64ac24e738 Generic semaphore implementation
Semaphores are no longer performance-critical, so a generic C
implementation is better for maintainability, debuggability and
extensibility.  Thanks to Peter Zijlstra for fixing the lockdep
warning.  Thanks to Harvey Harrison for pointing out that the
unlikely() was unnecessary.

Signed-off-by: Matthew Wilcox <willy@linux.intel.com>
Acked-by: Ingo Molnar <mingo@elte.hu>
2008-04-17 10:42:34 -04:00
Matthew Wilcox
8b91de2e58 Fix quota.h includes
quota.h currently relies on asm/semaphore.h (through some chain; it
doesn't actually include semaphore.h itself) to include wait.h.  As
well as being bad practice to rely on an implicit include, subsequent
patches will break this.  While I'm in this file, add atomic.h and
list.h, and sort the list of includes.

Signed-off-by: Matthew Wilcox <willy@linux.intel.com>
2008-04-17 10:42:14 -04:00
Mark Brown
34d278534d Input: wm97xx-core - support use as a wakeup source
The WM97xx touch screen controllers can be used to generate a wakeup
event when the system is suspended. Provide a new core API call
wm97xx_set_suspend_mode() allowing machine drivers to enable this. If no
suspend_mode is provided then the touch panel will be powered down when
the system is suspended.

Signed-off-by: Mark Brown <broonie@opensource.wolfsonmicro.com>
Signed-off-by: Dmitry Torokhov <dtor@mail.ru>
2008-04-17 09:24:58 -04:00
Christoph Hellwig
15aebd2866 udf: move headers out include/linux/
There's really no reason to keep udf headers in include/linux as they're
not used by anything but fs/udf/.

This patch merges most of include/linux/udf_fs_i.h into fs/udf/udf_i.h,
include/linux/udf_fs_sb.h into fs/udf/udf_sb.h and
include/linux/udf_fs.h into fs/udf/udfdecl.h.

The only thing remaining in include/linux/ is a stub of udf_fs_i.h
defining the four user-visible udf ioctls.  It's also moved from
unifdef-y to headers-y because it can be included unconditionally now.

Signed-off-by: Christoph Hellwig <hch@lst.de>
Signed-off-by: Jan Kara <jack@suse.cz>
2008-04-17 14:22:23 +02:00
Oleg Nesterov
3f3eafc921 locking: remove unused double_spin_lock()
double_spin_lock() has no callers, and it can't be used without additional
lockdep annotations, remove it.

Signed-off-by: Oleg Nesterov <oleg@tv-sign.ru>
Cc: Arjan van de Ven <arjan@linux.intel.com>
Cc: Heiko Carstens <heiko.carstens@de.ibm.com>
Cc: Ingo Molnar <mingo@elte.hu>
Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
2008-04-17 12:22:31 +02:00
Oleg Nesterov
8e60e05fdc hrtimers: simplify lockdep handling
In order to avoid the false positive from lockdep, each per-cpu base->lock has
the separate lock class and migrate_hrtimers() uses double_spin_lock().

This is overcomplicated: except for migrate_hrtimers() we never take 2 locks
at once, and migrate_hrtimers() can use spin_lock_nested().

Signed-off-by: Oleg Nesterov <oleg@tv-sign.ru>
Cc: Arjan van de Ven <arjan@linux.intel.com>
Cc: Heiko Carstens <heiko.carstens@de.ibm.com>
Cc: Ingo Molnar <mingo@elte.hu>
Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
2008-04-17 12:22:31 +02:00
Thomas Gleixner
a332d86d3c hrtimer: add nanosleep specific restart_block member
The back and forth typecasting of restart_block->args is horrible. We
added a separate union member for futex already. Do the same for
nanosleep.

Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
2008-04-17 12:22:30 +02:00
Russell King
d7b906897e [S390] genirq/clockevents: move irq affinity prototypes/inlines to interrupt.h
> Generic code is not supposed to include irq.h. Replace this include
> by linux/hardirq.h instead and add/replace an include of linux/irq.h
> in asm header files where necessary.
> This change should only matter for architectures that make use of
> GENERIC_CLOCKEVENTS.
> Architectures in question are mips, x86, arm, sh, powerpc, uml and sparc64.
>
> I did some cross compile tests for mips, x86_64, arm, powerpc and sparc64.
> This patch fixes also build breakages caused by the include replacement in
> tick-common.h.

I generally dislike adding optional linux/* includes in asm/* includes -
I'm nervous about this causing include loops.

However, there's a separate point to be discussed here.

That is, what interfaces are expected of every architecture in the kernel.
If generic code wants to be able to set the affinity of interrupts, then
that needs to become part of the interfaces listed in linux/interrupt.h
rather than linux/irq.h.

So what I suggest is this approach instead (against Linus' tree of a
couple of days ago) - we move irq_set_affinity() and irq_can_set_affinity()
to linux/interrupt.h, change the linux/irq.h includes to linux/interrupt.h
and include asm/irq_regs.h where needed (asm/irq_regs.h is supposed to be
rarely used include since not much touches the stacked parent context
registers.)

Build tested on ARM PXA family kernels and ARM's Realview platform
kernels which both use genirq.

[ tglx@linutronix.de: add GENERIC_HARDIRQ dependencies ]

Signed-off-by: Russell King <rmk+kernel@arm.linux.org.uk>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
Signed-off-by: Heiko Carstens <heiko.carstens@de.ibm.com>
2008-04-17 07:47:05 +02:00
Vladimir Sokolovsky
bbf8eed1a0 IB/mlx4: Add support for resizing CQs
Signed-off-by: Vladimir Sokolovsky <vlad@mellanox.co.il>
Signed-off-by: Roland Dreier <rolandd@cisco.com>
2008-04-16 21:09:33 -07:00
Eli Cohen
3fdcb97f0b IB/mlx4: Add support for modifying CQ moderation parameters
Signed-off-by: Eli Cohen <eli@mellnaox.co.il>
Signed-off-by: Roland Dreier <rolandd@cisco.com>
2008-04-16 21:09:33 -07:00
Eli Cohen
b832be1e40 IB/mlx4: Add IPoIB LSO support
Add TSO support to the mlx4_ib driver.

Signed-off-by: Eli Cohen <eli@mellanox.co.il>
Signed-off-by: Roland Dreier <rolandd@cisco.com>
2008-04-16 21:09:27 -07:00
Eli Cohen
8ff095ec4b IB/mlx4: Add IPoIB checksum offload support
ConnectX devices support checksum generation and verification of TCP
and UDP packets for UD IPoIB messages.  This patch checks if the HCA
supports this and sets the IB_DEVICE_UD_IP_CSUM capability flag if it
does.  It implements support for handling the IB_SEND_IP_CSUM send
flag and setting the csum_ok field in receive work completions.

Signed-off-by: Eli Cohen <eli@mellanox.co.il>
Signed-off-by: Ali Ayub <ali@mellanox.co.il>
Signed-off-by: Roland Dreier <rolandd@cisco.com>
2008-04-16 21:01:10 -07:00
Roland Dreier
37608eea86 mlx4_core: Fix confusion between mlx4_event and mlx4_dev_event enums
The struct mlx4_interface.event() method was supposed to get an enum
mlx4_dev_event, but the driver code was actually passing in the
hardware enum mlx4_event values.  Fix up the callers of
mlx4_dispatch_event() so that they pass in the right type of value,
and fix up the event method in mlx4_ib so that it can handle the enum
mlx4_dev_event values.

This eliminates the need for the subtype parameter to the event
method, so remove it.

This also fixes the sparse warning

    drivers/net/mlx4/intf.c:127:48: warning: mixing different enum types
    drivers/net/mlx4/intf.c:127:48:     int enum mlx4_event  versus
    drivers/net/mlx4/intf.c:127:48:     int enum mlx4_dev_event

Signed-off-by: Roland Dreier <rolandd@cisco.com>
2008-04-16 21:01:08 -07:00
Andy Fleming
c5e38a949b phy: Clean up header style
Multi-line comments weren't all CodingStyle compliant

Signed-off-by: Andy Fleming <afleming@freescale.com>
Signed-off-by: Jeff Garzik <jgarzik@redhat.com>
2008-04-16 20:09:35 -04:00
Andy Fleming
9d9326d3bc phy: Change mii_bus id field to a string
Having the id field be an int was making more complex bus topologies
excessively difficult.  For now, just convert it to a string, and
change all instances of "bus->id = val" to
snprintf(id, MII_BUS_ID_LEN, "%x", val).

Signed-off-by: Andy Fleming <afleming@freescale.com>
Signed-off-by: Jeff Garzik <jgarzik@redhat.com>
2008-04-16 20:09:35 -04:00
Jochen Friedrich
612212a3f2 [POWERPC] i2c: OF helpers for the i2c API
This implements various helpers to support OF bindings for the i2c
API.

Signed-off-by: Jochen Friedrich <jochen@scram.de>
Acked-by: David S. Miller <davem@davemloft.net>
Acked-by: Stephen Rothwell <sfr@canb.auug.org.au>
Signed-off-by: Paul Mackerras <paulus@samba.org>
2008-04-17 07:46:11 +10:00
Anton Vorontsov
863fbf4966 [POWERPC] OF helpers for the GPIO API
This implements various helpers to support OF bindings for the GPIO
LIB API.

Previously this was PowerPC specific, but it seems this code isn't
arch-dependent anyhow, so let's place it into of/.

SPARC will not see this addition yet, real hardware seem to not use
GPIOs at all. But this might change:

   http://www.leox.org/docs/faq_MLleon.html

"16-bit I/O port" sounds promising. :-)

Signed-off-by: Anton Vorontsov <avorontsov@ru.mvista.com>
Acked-by: Grant Likely <grant.likely@secretlab.ca>
Signed-off-by: Paul Mackerras <paulus@samba.org>
2008-04-17 07:46:11 +10:00
Linus Torvalds
6af74b03e0 Merge branch 'for-linus' of git://git.kernel.dk/linux-2.6-block
* 'for-linus' of git://git.kernel.dk/linux-2.6-block:
  block: update git url for blktrace
  io context: increment task attachment count in ioc_task_link()
2008-04-16 07:45:45 -07:00
Linus Torvalds
b4b8f57965 Merge git://git.kernel.org/pub/scm/linux/kernel/git/davem/net-2.6
* git://git.kernel.org/pub/scm/linux/kernel/git/davem/net-2.6:
  [TCP]: Add return value indication to tcp_prune_ofo_queue().
  PS3: gelic: fix the oops on the broken IE returned from the hypervisor
  b43legacy: fix DMA mapping leakage
  mac80211: remove message on receiving unexpected unencrypted frames
  Update rt2x00 MAINTAINERS entry
  Add rfkill to MAINTAINERS file
  rfkill: Fix device type check when toggling states
  b43legacy: Fix usage of struct device used for DMAing
  ssb: Fix usage of struct device used for DMAing
  MAINTAINERS: move to generic repository for iwlwifi
  b43legacy: fix initvals loading on bcm4303
  rtl8187: Add missing priv->vif assignments
  netconsole: only set CON_PRINTBUFFER if the user specifies a netconsole
  [CAN]: Update documentation of struct sockaddr_can
  MAINTAINERS: isdn4linux@listserv.isdn4linux.de is subscribers-only
  [TCP]: Fix never pruned tcp out-of-order queue.
  [NET_SCHED] sch_api: fix qdisc_tree_decrease_qlen() loop
2008-04-16 07:44:27 -07:00
Denis V. Lunev
f3005d7f4a [NETNS]: Add netns refcnt debug for network devices.
dev_set_net is called for
- just allocated devices
- devices moving from one namespace to another
release_net has proper check inside to distinguish these cases.

Signed-off-by: Denis V. Lunev <den@openvz.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
2008-04-16 02:02:18 -07:00
Pavel Emelyanov
a9fde26078 [VLAN]: Tag vlan_group_device with net device, not ifindex.
Currently vlan group is searched using one key - the ifindex.
We'll have to lookup the vlan_group by two keys - ifindex and
net. Turning the vlan_group lookup key to struct net_device
pointer will make this process easier.

Besides, this will eliminate one more place in the networking,
that assumes that indexes are unique in the kernel.

Signed-off-by: Pavel Emelyanov <xemul@openvz.org>
Acked-by: Patrick McHardy <kaber@trash.net>
Signed-off-by: David S. Miller <davem@davemloft.net>
2008-04-16 00:48:04 -07:00
Krzysztof Helt
5f1a3f2ac4 acpi thermal trip points increased to 12
The THERMAL_MAX_TRIPS value is set to 10.  It is too few for the Compaq AP550
machine which has 12 trip points.

Signed-off-by: Krzysztof Helt <krzysztof.h1@wp.pl>
Cc: Len Brown <lenb@kernel.org>
Cc: Zhang Rui <rui.zhang@intel.com>
Cc: "Rafael J. Wysocki" <rjw@sisk.pl>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2008-04-15 19:35:41 -07:00
Jan Kara
335e92e8a5 vfs: fix possible deadlock in ext2, ext3, ext4 when using xattrs
mb_cache_entry_alloc() was allocating cache entries with GFP_KERNEL.  But
filesystems are calling this function while holding xattr_sem so possible
recursion into the fs violates locking ordering of xattr_sem and transaction
start / i_mutex for ext2-4.  Change mb_cache_entry_alloc() so that filesystems
can specify desired gfp mask and use GFP_NOFS from all of them.

Signed-off-by: Jan Kara <jack@suse.cz>
Reported-by: Dave Jones <davej@redhat.com>
Cc: <linux-ext4@vger.kernel.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2008-04-15 19:35:41 -07:00
Michael Buesch
4ac58469f1 ssb: Fix usage of struct device used for DMAing
This fixes DMA on architectures where DMA is nontrivial, like PPC64.
We must use the host-device's (PCI) struct device for any DMA
operation instead of the SSB device. For this we add a new
struct device pointer to the SSB device structure that will always
point to the right device for DMAing.

Without this patch b43 and b44 drivers won't work on complex-DMA
architectures, that for example need dev->archdata for DMA operations.

Signed-off-by: Michael Buesch <mb@bu3sch.de>
Signed-off-by: John W. Linville <linville@tuxdriver.com>
2008-04-15 15:04:35 -04:00
Martin Kebert
3e24e2b5ae Input: add Zhen Hua driver
This is a driver for Zhen Hua PPM-4CH RC transmitter (commonly used in cheap
Ready To Fly RC helicopters by Walkera) which using "Zhen Hua 5-byte protocol"
for using them as a four axis joystick via serial port.  Transmitter connected
to serial port (19200 8N1) sending periodically 5 bytes where first byte is for
synchronization and next four bytes are values of axis.

Signed-off-by: Martin Kebert <gkmarty@gmail.com>
Signed-off-by: Jiri Kosina <jkosina@suse.cz>
Signed-off-by: Dmitry Torokhov <dtor@mail.ru>
2008-04-15 13:26:52 -04:00
David S. Miller
c50f68c8ae [LMB] Add lmb_alloc_nid()
A variant of lmb_alloc() that tries to allocate memory on a specified
NUMA node 'nid' but falls back to normal lmb_alloc() if that fails.

The caller provides a 'nid_range' function pointer which assists the
allocator.  It is given args 'start', 'end', and pointer to integer
'this_nid'.

It places at 'this_nid' the NUMA node id that corresponds to 'start',
and returns the end address within 'start' to 'end' at which memory
assosciated with 'nid' ends.

This callback allows a platform to use lmb_alloc_nid() in just
about any context, even ones in which early_pfn_to_nid() might
not be working yet.

This function will be used by the NUMA setup code on sparc64, and also
it can be used by powerpc, replacing it's hand crafted
"careful_allocation()" function in arch/powerpc/mm/numa.c

If x86 ever converts it's NUMA support over to using the LMB helpers,
it can use this too as it has something entirely similar.

Signed-off-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Paul Mackerras <paulus@samba.org>
2008-04-15 21:22:17 +10:00
Adrian Bunk
31efdf0530 [ISDN] include/linux/isdn.h: remove dead code
This patch remove the usage of a nonexisting kconfig variable.

Reported-by: Robert P. J. Day <rpjday@crashcourse.ca>
Signed-off-by: Adrian Bunk <bunk@kernel.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
2008-04-15 00:30:16 -07:00
Adrian Bunk
99971e70fd [WANPIPE]: Forgotten bits of Sangoma drivers removal.
Robert P. J. Day spotted that my removal of the Sangoma drivers missed
a few bits.

Signed-off-by: Adrian Bunk <bunk@kernel.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
2008-04-15 00:27:58 -07:00
Jens Axboe
d237e5c7ce io context: increment task attachment count in ioc_task_link()
Thanks to Nikanth Karthikesan <knikanth@suse.de> for reporting this.

Signed-off-by: Jens Axboe <jens.axboe@oracle.com>
2008-04-15 09:25:33 +02:00
Karl Dahlke
0beb4f6f29 Input: put ledstate in the keyboard notifier
Led state should be part of the key event, like shiftstate, and not
grabbed asynchronously after the fact.

[samuel.thibault@ens-lyon.org: various fixes]

Signed-off-by: Samuel Thibault <samuel.thibault@ens-lyon.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Dmitry Torokhov <dtor@mail.ru>
2008-04-15 01:30:32 -04:00
David Brownell
79966fd9b4 ARM: OMAP: I2C: tps65010 driver converts to gpiolib
Make the tps65010 driver use gpiolib to expose its GPIOs.

Note: This patch will get merged via omap tree instead of I2C
as it will cause some board updates. This has been discussed
at on the I2C list:

http://lists.lm-sensors.org/pipermail/i2c/2008-March/003031.html

Signed-off-by: David Brownell <dbrownell@users.sourceforge.net>
Cc: i2c@lm-sensors.org
Signed-off-by: Tony Lindgren <tony@atomide.com>
2008-04-14 09:57:06 -07:00
Christoph Lameter
0f389ec630 slub: No need for per node slab counters if !SLUB_DEBUG
The per node counters are used mainly for showing data through the sysfs API.
If that API is not compiled in then there is no point in keeping track of this
data. Disable counters for the number of slabs and the number of total slabs
if !SLUB_DEBUG. Incrementing the per node counters is also accessing a
potentially contended cacheline so this could actually be a performance
benefit to embedded systems.

SLABINFO support is also affected. It now must depends on SLUB_DEBUG (which
is on by default).

Patch also avoids a check for a NULL kmem_cache_node pointer in new_slab()
if the system is not compiled with NUMA support.

[penberg@cs.helsinki.fi: fix oops and move ->nr_slabs into CONFIG_SLUB_DEBUG]
Signed-off-by: Christoph Lameter <clameter@sgi.com>
Signed-off-by: Pekka Enberg <penberg@cs.helsinki.fi>
2008-04-14 18:53:02 +03:00
Linus Torvalds
533bb8a4d7 Merge git://git.kernel.org/pub/scm/linux/kernel/git/davem/net-2.6
* git://git.kernel.org/pub/scm/linux/kernel/git/davem/net-2.6: (31 commits)
  [BRIDGE]: Fix crash in __ip_route_output_key with bridge netfilter
  [NETFILTER]: ipt_CLUSTERIP: fix race between clusterip_config_find_get and _entry_put
  [IPV6] ADDRCONF: Don't generate temporary address for ip6-ip6 interface.
  [IPV6] ADDRCONF: Ensure disabling multicast RS even if privacy extensions are disabled.
  [IPV6]: Use appropriate sock tclass setting for routing lookup.
  [IPV6]: IPv6 extension header structures need to be packed.
  [IPV6]: Fix ipv6 address fetching in raw6_icmp_error().
  [NET]: Return more appropriate error from eth_validate_addr().
  [ISDN]: Do not validate ISDN net device address prior to interface-up
  [NET]: Fix kernel-doc for skb_segment
  [SOCK] sk_stamp: should be initialized to ktime_set(-1L, 0)
  net: check for underlength tap writes
  net: make struct tun_struct private to tun.c
  [SCTP]: IPv4 vs IPv6 addresses mess in sctp_inet[6]addr_event.
  [SCTP]: Fix compiler warning about const qualifiers
  [SCTP]: Fix protocol violation when receiving an error lenght INIT-ACK
  [SCTP]: Add check for hmac_algo parameter in sctp_verify_param()
  [NET_SCHED] cls_u32: refcounting fix for u32_delete()
  [DCCP]: Fix skb->cb conflicts with IP
  [AX25]: Potential ax25_uid_assoc-s leaks on module unload.
  ...
2008-04-14 07:56:24 -07:00
Paul Mackerras
ac7c5353b1 Merge branch 'linux-2.6' 2008-04-14 21:11:02 +10:00
David S. Miller
334f8b2afd Merge branch 'master' of git://git.kernel.org/pub/scm/linux/kernel/git/kaber/nf-2.6.26 2008-04-14 03:50:43 -07:00
David S. Miller
df39e8ba56 Merge branch 'master' of master.kernel.org:/pub/scm/linux/kernel/git/davem/net-2.6
Conflicts:

	drivers/net/ehea/ehea_main.c
	drivers/net/wireless/iwlwifi/Kconfig
	drivers/net/wireless/rt2x00/rt61pci.c
	net/ipv4/inet_timewait_sock.c
	net/ipv6/raw.c
	net/mac80211/ieee80211_sta.c
2008-04-14 02:30:23 -07:00
Peter Warasin
e7bfd0a1a6 [NETFILTER]: bridge: add ebt_nflog watcher
This patch adds the ebtables nflog watcher to the kernel in order to
allow ebtables log through the nfnetlink_log backend.

Signed-off-by: Peter Warasin <peter@endian.com>
Signed-off-by: Patrick McHardy <kaber@trash.net>
2008-04-14 11:15:54 +02:00
Patrick McHardy
dd13b01036 [NETFILTER]: nf_nat: kill helper and seq_adjust hooks
Connection tracking helpers (specifically FTP) need to be called
before NAT sequence numbers adjustments are performed to be able
to compare them against previously seen ones. We've introduced
two new hooks around 2.6.11 to maintain this ordering when NAT
modules were changed to get called from conntrack helpers directly.

The cost of netfilter hooks is quite high and sequence number
adjustments are only rarely needed however. Add a RCU-protected
sequence number adjustment function pointer and call it from
IPv4 conntrack after calling the helper.

Signed-off-by: Patrick McHardy <kaber@trash.net>
2008-04-14 11:15:52 +02:00
Patrick McHardy
2bc780499a [NETFILTER]: nf_conntrack: add DCCP protocol support
Add DCCP conntrack helper. Thanks to Gerrit Renker <gerrit@erg.abdn.ac.uk>
for review and testing.

Signed-off-by: Patrick McHardy <kaber@trash.net>
2008-04-14 11:15:49 +02:00
Patrick McHardy
d63a650736 [NETFILTER]: Add partial checksum validation helper
Move the UDP-Lite conntrack checksum validation to a generic helper
similar to nf_checksum() and make it fall back to nf_checksum()
in case the full packet is to be checksummed and hardware checksums
are available. This is to be used by DCCP conntrack, which also
needs to verify partial checksums.

Signed-off-by: Patrick McHardy <kaber@trash.net>
2008-04-14 11:15:49 +02:00
Jan Engelhardt
3bb0362d2f [NETFILTER]: remove arpt_(un)register_target indirection macros
Signed-off-by: Jan Engelhardt <jengelh@computergmbh.de>
Signed-off-by: Patrick McHardy <kaber@trash.net>
2008-04-14 11:15:44 +02:00
Jan Engelhardt
95eea855af [NETFILTER]: remove arpt_target indirection macro
Signed-off-by: Jan Engelhardt <jengelh@computergmbh.de>
Signed-off-by: Patrick McHardy <kaber@trash.net>
2008-04-14 11:15:43 +02:00
Jan Engelhardt
4abff0775d [NETFILTER]: remove arpt_table indirection macro
Signed-off-by: Jan Engelhardt <jengelh@computergmbh.de>
Signed-off-by: Patrick McHardy <kaber@trash.net>
2008-04-14 11:15:43 +02:00
Jan Engelhardt
5452e425ad [NETFILTER]: annotate {arp,ip,ip6,x}tables with const
Signed-off-by: Jan Engelhardt <jengelh@computergmbh.de>
Signed-off-by: Patrick McHardy <kaber@trash.net>
2008-04-14 11:15:35 +02:00
Jan Engelhardt
b9f61b1603 [NETFILTER]: xt_sctp: simplify xt_sctp.h
The use of xt_sctp.h flagged up -Wshadow warnings in userspace, which
prompted me to look at it and clean it up. Basic operations have been
directly replaced by library calls (memcpy, memset is both available
in the kernel and userspace, and usually faster than a self-made
loop). The is_set and is_clear functions now use a processing time
shortcut, too.

Signed-off-by: Jan Engelhardt <jengelh@computergmbh.de>
Signed-off-by: Patrick McHardy <kaber@trash.net>
2008-04-14 09:56:04 +02:00
Alexey Dobriyan
666953df35 [NETFILTER]: ip_tables: per-netns FILTER/MANGLE/RAW tables for real
Commit 9335f047fe aka
"[NETFILTER]: ip_tables: per-netns FILTER, MANGLE, RAW"
added per-netns _view_ of iptables rules. They were shown to user, but
ignored by filtering code. Now that it's possible to at least ping loopback,
per-netns tables can affect filtering decisions.

netns is taken in case of
	PRE_ROUTING, LOCAL_IN -- from in device,
	POST_ROUTING, LOCAL_OUT -- from out device,
	FORWARD -- from in device which should be equal to out device's netns.
		   This code is relatively new, so BUG_ON was plugged.

Wrappers were added to a) keep code the same from CONFIG_NET_NS=n users
(overwhelming majority), b) consolidate code in one place -- similar
changes will be done in ipv6 and arp netfilter code.

Signed-off-by: Alexey Dobriyan <adobriyan@sw.ru>
Signed-off-by: Patrick McHardy <kaber@trash.net>
2008-04-14 09:56:02 +02:00
Gerrit Renker
f5572855ec [SKB]: __skb_queue_tail = __skb_insert before
This expresses __skb_queue_tail() in terms of __skb_insert(),
using __skb_insert_before() as auxiliary function.

Signed-off-by: Gerrit Renker <gerrit@erg.abdn.ac.uk>
Signed-off-by: David S. Miller <davem@davemloft.net>
2008-04-14 00:05:28 -07:00
Gerrit Renker
7de6c03336 [SKB]: __skb_append = __skb_queue_after
This expresses __skb_append in terms of __skb_queue_after, exploiting that

  __skb_append(old, new, list) = __skb_queue_after(list, old, new).

Signed-off-by: Gerrit Renker <gerrit@erg.abdn.ac.uk>
Signed-off-by: David S. Miller <davem@davemloft.net>
2008-04-14 00:05:09 -07:00
Gerrit Renker
bf29927588 [SKB]: __skb_queue_after(prev) = __skb_insert(prev, prev->next)
By reordering, __skb_queue_after() is expressed in terms of __skb_insert().

Signed-off-by: Gerrit Renker <gerrit@erg.abdn.ac.uk>
Signed-off-by: David S. Miller <davem@davemloft.net>
2008-04-14 00:04:51 -07:00
Gerrit Renker
f525c06d12 [SKB]: __skb_dequeue = skb_peek + __skb_unlink
By rearranging the order of declarations, __skb_dequeue() is expressed in terms of

 * skb_peek() and
 * __skb_unlink(),

thus in effect mirroring the analogue implementation of __skb_dequeue_tail().

Signed-off-by: Gerrit Renker <gerrit@erg.abdn.ac.uk>
Signed-off-by: David S. Miller <davem@davemloft.net>
2008-04-14 00:04:12 -07:00
YOSHIFUJI Hideaki
7cd636fe9c [IPV6]: IPv6 extension header structures need to be packed.
struct ipv6_opt_hdr is the common structure for IPv6 extension
headers, and it is common to increment the pointer to get
the real content.  On the other hand, since the structure
consists only of 1-byte next-header field and 1-byte length
field, size of that structure depends on architecture; 2 or 4.
Add "packed" attribute to get 2.

Signed-off-by: YOSHIFUJI Hideaki <yoshfuji@linux-ipv6.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
2008-04-13 23:33:52 -07:00
YOSHIFUJI Hideaki
cee8947338 [IPV6] MROUTE: Do not call ipv6_find_idev() directly.
Since NETDEV_REGISTER notifier chain is responsible for creating
inet6_dev{}, we do not need to call ipv6_find_idev() directly here.

Signed-off-by: YOSHIFUJI Hideaki <yoshfuji@linux-ipv6.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
2008-04-13 23:21:16 -07:00
David S. Miller
6fb9114e4b Merge branch 'net-2.6.26-misc-20080412b' of git://git.linux-ipv6.org/gitroot/yoshfuji/linux-2.6-dev 2008-04-12 19:19:46 -07:00
Paul Moore
03e1ad7b5d LSM: Make the Labeled IPsec hooks more stack friendly
The xfrm_get_policy() and xfrm_add_pol_expire() put some rather large structs
on the stack to work around the LSM API.  This patch attempts to fix that
problem by changing the LSM API to require only the relevant "security"
pointers instead of the entire SPD entry; we do this for all of the
security_xfrm_policy*() functions to keep things consistent.

Signed-off-by: Paul Moore <paul.moore@hp.com>
Acked-by: James Morris <jmorris@namei.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
2008-04-12 19:07:52 -07:00
Rusty Russell
14daa02139 net: make struct tun_struct private to tun.c
There's no reason for this to be in the header, and it just hurts
recompile time.

Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
Acked-by: Max Krasnyanskiy <maxk@qualcomm.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2008-04-12 18:48:58 -07:00
YOSHIFUJI Hideaki
f3ee4010e8 [IPV6]: Define constants for link-local multicast addresses.
- Define link-local all-node / all-router multicast addresses.
- Remove ipv6_addr_all_nodes() and ipv6_addr_all_routers().

Signed-off-by: YOSHIFUJI Hideaki <yoshfuji@linux-ipv6.org>
2008-04-12 13:43:19 +09:00
Linus Torvalds
14897e35fd Merge branch 'docs' of git://git.lwn.net/linux-2.6
* 'docs' of git://git.lwn.net/linux-2.6:
  Add additional examples in Documentation/spinlocks.txt
  Move sched-rt-group.txt to scheduler/
  Documentation: move rpc-cache.txt to filesystems/
  Documentation: move nfsroot.txt to filesystems/
  Spell out behavior of atomic_dec_and_lock() in kerneldoc
  Fix a typo in highres.txt
  Fixes to the seq_file document
  Fill out information on patch tags in SubmittingPatches
  Add the seq_file documentation
2008-04-11 13:24:16 -07:00
J. Bruce Fields
dc07e721a2 Spell out behavior of atomic_dec_and_lock() in kerneldoc
A little more detail here wouldn't hurt.

Signed-off-by: J. Bruce Fields <bfields@citi.umich.edu>
Signed-off-by: Jonathan Corbet <corbet@lwn.net>
2008-04-11 13:17:46 -06:00
Heiko Carstens
b0fac02370 Fix "$(AS) -traditional" compile breakage caused by asmlinkage_protect
git commit 54a0151041 ("asmlinkage_protect
replaces prevent_tail_call") causes this build failure on s390:

    AS      arch/s390/kernel/entry64.o
  In file included from arch/s390/kernel/entry64.S:14:
  include/linux/linkage.h:34: error: syntax error in macro parameter list
  make[1]: *** [arch/s390/kernel/entry64.o] Error 1
  make: *** [arch/s390/kernel] Error 2

and some other architectures.  The reason is that some architectures add
the "-traditional" flag to the invocation of $(AS), which disables
variadic macro argument support.

So just surround the new define with an #ifndef __ASSEMBLY__ to prevent
any side effects on asm code.

Cc: Roland McGrath <roland@redhat.com>
Cc: Martin Schwidefsky <schwidefsky@de.ibm.com>
Signed-off-by: Heiko Carstens <heiko.carstens@de.ibm.com>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2008-04-11 08:29:13 -07:00
Bjorn Helgaas
544451a1a3 pnp: increase number of devices supported per protocol
Increase the PNP "number of devices" limit.  We currently use an unsigned
char, which limits us to 256 devices per protocol.  This patch changes that to
an unsigned int.

Not all backends can take advantage of this: we limit ISAPNP to 10 devices in
isapnp_cfg_begin(), and PNPBIOS is limited to 256 devices because the BIOS
interfaces use a one-byte device node number.

But there is no limit on the number of PNPACPI devices we may have.  Large HP
Integrity machines have more than 256, which causes the current "unsigned char
number" to wrap around.  This causes errors like this:

    pnp: PnP ACPI init
    kobject_add failed for 00:00 with -EEXIST, don't try to register things with the same name in the same directory.

    Call Trace:
     [<a000000100010720>] show_stack+0x40/0xa0
     [<a0000001000107b0>] dump_stack+0x30/0x60
     [<a0000001001dbdf0>] kobject_add+0x290/0x2c0
     [<a0000001002bfd40>] device_add+0x160/0x860
     [<a0000001002c0470>] device_register+0x30/0x60
     [<a00000010026ba70>] __pnp_add_device+0x130/0x180
     [<a00000010026bb70>] pnp_add_device+0xb0/0xe0
     [<a0000001007f2730>] pnpacpi_add_device+0x510/0x5a0
     [<a0000001007f2810>] pnpacpi_add_device_handler+0x50/0x80

This patch increases the limit to fix this PNPACPI problem.  It should not
have any adverse effect on ISAPNP or PNPBIOS because their limits are still
enforced in the backends.

Signed-off-by: Bjorn Helgaas <bjorn.helgaas@hp.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2008-04-11 08:06:44 -07:00
Linus Torvalds
d10d89ec78 Add commentary about the new "asmlinkage_protect()" macro
It's really a pretty ugly thing to need, and some day it will hopefully
be obviated by teaching gcc about the magic calling conventions for the
low-level system call code, but in the meantime we can at least add big
honking comments about why we need these insane and strange macros.

I took my comments from my version of the macro, but I ended up deciding
to just pick Roland's version of the actual code instead (with his
prettier syntax that uses vararg macros).  Thus the previous two commits
that actually implement it.

Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2008-04-10 17:35:23 -07:00
Roland McGrath
54a0151041 asmlinkage_protect replaces prevent_tail_call
The prevent_tail_call() macro works around the problem of the compiler
clobbering argument words on the stack, which for asmlinkage functions
is the caller's (user's) struct pt_regs.  The tail/sibling-call
optimization is not the only way that the compiler can decide to use
stack argument words as scratch space, which we have to prevent.
Other optimizations can do it too.

Until we have new compiler support to make "asmlinkage" binding on the
compiler's own use of the stack argument frame, we have work around all
the manifestations of this issue that crop up.

More cases seem to be prevented by also keeping the incoming argument
variables live at the end of the function.  This makes their original
stack slots attractive places to leave those variables, so the compiler
tends not clobber them for something else.  It's still no guarantee, but
it handles some observed cases that prevent_tail_call() did not.

Signed-off-by: Roland McGrath <roland@redhat.com>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2008-04-10 17:28:26 -07:00
Patrick McHardy
4738c1db15 [SKFILTER]: Add SKF_ADF_NLATTR instruction
SKF_ADF_NLATTR searches for a netlink attribute, which avoids manually
parsing and walking attributes. It takes the offset at which to start
searching in the 'A' register and the attribute type in the 'X' register
and returns the offset in the 'A' register. When the attribute is not
found it returns zero.

A top-level attribute can be located using a filter like this
(example for nfnetlink, using struct nfgenmsg):

	...
	{
		/* A = offset of first attribute */
		.code	= BPF_LD | BPF_IMM,
		.k	= sizeof(struct nlmsghdr) + sizeof(struct nfgenmsg)
	},
	{
		/* X = CTA_PROTOINFO */
		.code	= BPF_LDX | BPF_IMM,
		.k	= CTA_PROTOINFO,
	},
	{
		/* A = netlink attribute offset */
		.code	= BPF_LD | BPF_B | BPF_ABS,
		.k	= SKF_AD_OFF + SKF_AD_NLATTR
	},
	{
		/* Exit if not found */
		.code   = BPF_JMP | BPF_JEQ | BPF_K,
		.k	= 0,
		.jt	= <error>
	},
	...

A nested attribute below the CTA_PROTOINFO attribute would then
be parsed like this:

	...
	{
		/* A += sizeof(struct nlattr) */
		.code	= BPF_ALU | BPF_ADD | BPF_K,
		.k	= sizeof(struct nlattr),
	},
	{
		/* X = CTA_PROTOINFO_TCP */
		.code	= BPF_LDX | BPF_IMM,
		.k	= CTA_PROTOINFO_TCP,
	},
	{
		/* A = netlink attribute offset */
		.code	= BPF_LD | BPF_B | BPF_ABS,
		.k	= SKF_AD_OFF + SKF_AD_NLATTR
	},
	...

The data of an attribute can be loaded into 'A' like this:

	...
	{
		/* X = A (attribute offset) */
		.code	= BPF_MISC | BPF_TAX,
	},
	{
		/* A = skb->data[X + k] */
		.code 	= BPF_LD | BPF_B | BPF_IND,
		.k	= sizeof(struct nlattr),
	},
	...

Signed-off-by: Patrick McHardy <kaber@trash.net>
Signed-off-by: David S. Miller <davem@davemloft.net>
2008-04-10 02:02:28 -07:00
Stephen Hemminger
43db6d65e0 socket: sk_filter deinline
The sk_filter function is too big to be inlined. This saves 2296 bytes
of text on allyesconfig.

Signed-off-by: Stephen Hemminger <shemminger@vyatta.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2008-04-10 01:43:09 -07:00
Stephen Hemminger
b715631fad socket: sk_filter minor cleanups
Some minor style cleanups:
  * Move __KERNEL__ definitions to one place in filter.h
  * Use const for sk_filter_len
  * Line wrapping
  * Put EXPORT_SYMBOL next to function definition

Signed-off-by: Stephen Hemminger <shemminger@vyatta.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2008-04-10 01:33:47 -07:00
Michael Buesch
d625a29ba6 ssb: Add support for block-I/O
This adds support for block based I/O to SSB.
This is needed in order to efficiently support PIO data
transfers to the card.
The block-I/O support is only compiled, if it's selected by the
weird driver that needs it. So there's no overhead for sane devices.

Signed-off-by: Michael Buesch <mb@bu3sch.de>
Signed-off-by: John W. Linville <linville@tuxdriver.com>
2008-04-08 16:44:40 -04:00
Michael Buesch
8fe2b65a18 ssb: Turn suspend/resume upside down
Turn the SSB bus suspend mechanism upside down.
Instead of deciding by an internal reference count when to suspend/resume,
let the parent bus call us in their suspend/resume routine.

Signed-off-by: Michael Buesch <mb@bu3sch.de>
Signed-off-by: John W. Linville <linville@tuxdriver.com>
2008-04-08 15:05:57 -04:00
David S. Miller
8eefca4888 Merge branch 'net-2.6.26-isatap-20080403' of git://git.linux-ipv6.org/gitroot/yoshfuji/linux-2.6-dev 2008-04-08 02:33:36 -07:00
Rusty Russell
2557a933b7 virtio: remove overzealous BUG_ON.
The 'disable_cb' callback is designed as an optimization to tell the host
we don't need callbacks now.  As it is not reliable, the debug check is
overzealous: it can happen on two CPUs at the same time.  Document this.

Even if it were reliable, the virtio_net driver doesn't disable
callbacks on transmit so the START_USE/END_USE debugging reentrance
protection can be easily tripped even on UP.

Thanks to Balaji Rao for the bug report and testing.

Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
CC: Balaji Rao <balajirrao@gmail.com>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2008-04-07 13:14:22 -07:00
James Bottomley
2f3edc6936 [SCSI] transport_class: BUG if we can't release the attribute container
Every current transport class calls transport_container_release but
ignores the return value.  This is catastrophic if it returns an error
because the containers are part of a global list and the next action of
almost every transport class is to free the memory used by the
container.

Fix this by making transport_container_release a void, but making it BUG
if attribute_container_release returns an error ... this catches the
root cause of a system panic much earlier.  If we don't do this, we get
an eventual BUG when the attribute container list notices the corruption
caused by the freed memory it's still referencing.

Also made attribute_container_release __must_check as a reminder.

Cc: Greg KH <greg@kroah.com>
Signed-off-by: James Bottomley <James.Bottomley@HansenPartnership.com>
2008-04-07 12:19:10 -05:00
FUJITA Tomonori
b1adaf65ba [SCSI] block: add sg buffer copy helper functions
This patch adds new three helper functions to copy data between an SG
list and a linear buffer.

- sg_copy_from_buffer copies data from linear buffer to an SG list

- sg_copy_to_buffer copies data from an SG list to a linear buffer

When the APIs copy data from a linear buffer to an SG list,
flush_kernel_dcache_page is called. It's not necessary for everyone
but it's a no-op on most architectures and in general the API is not
used in performance critical path.

Signed-off-by: FUJITA Tomonori <fujita.tomonori@lab.ntt.co.jp>
Acked-by: Jens Axboe <jens.axboe@oracle.com>
Signed-off-by: James Bottomley <James.Bottomley@HansenPartnership.com>
2008-04-07 12:15:45 -05:00
Kai Makisara
40f6b36c62 [SCSI] st: add option to use SILI in variable block reads
Add new option MT_ST_SILI to enable setting the SILI bit in reads in variable
block mode. If SILI is set, reading a block shorter than the byte count does
not result in CHECK CONDITION. The length of the block is determined using the
residual count from the HBA. Avoiding the REQUEST SENSE command for every
block speeds up some real applications considerably.

Signed-off-by: Kai Makisara <kai.makisara@kolumbus.fi>
Signed-off-by: James Bottomley <James.Bottomley@HansenPartnership.com>
2008-04-07 12:15:39 -05:00
Josh Boyer
834d97d452 [POWERPC] Add of_device_is_available function
IEEE 1275 defined a standard "status" property to indicate the operational
status of a device.  The property has four possible values: okay, disabled,
fail, fail-xxx.  The absence of this property means the operational status
of the device is unknown or okay.

This adds a function called of_device_is_available that checks the state
of the status property of a device.  If the property is absent or set to
either "okay" or "ok", it returns 1.  Otherwise it returns 0.

Signed-off-by: Josh Boyer <jwboyer@linux.vnet.ibm.com>
Signed-off-by: Paul Mackerras <paulus@samba.org>
2008-04-07 13:49:23 +10:00
Stelian Pop
8d855317fc atmel_usba_udc: move endpoint declarations into platform data.
The atmel_usba_udc driver is being used by several platforms and arches
(avr32 and at91 ATM), and each platform may have different endpoint
settings.

The patch below moves the endpoint declarations into the platform
data and make the necessary adjustments for AVR32 (improved by
Haavard Skinnemoen <hskinnemoen@atmel.com>).

Signed-off-by: Stelian Pop <stelian@popies.net>
Acked-by: David Brownell <dbrownell@users.sourceforge.net>
Signed-off-by: Haavard Skinnemoen <hskinnemoen@atmel.com>
2008-04-06 17:15:08 -04:00
YOSHIFUJI Hideaki
12802d058a [IPV6]: Comment MRT6_xxx sockopts in include/linux/in6.h.
Signed-off-by: YOSHIFUJI Hideaki <yoshfuji@linux-ipv6.org>
2008-04-05 22:33:40 +09:00
YOSHIFUJI Hideaki
14fb64e1f4 [IPV6] MROUTE: Support PIM-SM (SSM).
Based on ancient patch by Mickael Hoerdt
<hoerdt@clarinet.u-strasbg.fr>, which is available at
<http://www-r2.u-strasbg.fr/~hoerdt/dev/linux_ipv6_mforwarding/patch-linux-ipv6-mforwarding-0.1a>.

Signed-off-by: YOSHIFUJI Hideaki <yoshfuji@linux-ipv6.org>
2008-04-05 22:33:39 +09:00
YOSHIFUJI Hideaki
7bc570c8b4 [IPV6] MROUTE: Support multicast forwarding.
Based on ancient patch by Mickael Hoerdt
<hoerdt@clarinet.u-strasbg.fr>, which is available at
<http://www-r2.u-strasbg.fr/~hoerdt/dev/linux_ipv6_mforwarding/patch-linux-ipv6-mforwarding-0.1a>.

Signed-off-by: YOSHIFUJI Hideaki <yoshfuji@linux-ipv6.org>
2008-04-05 22:33:38 +09:00
Paul Menage
8bab8dded6 cgroups: add cgroup support for enabling controllers at boot time
The effects of cgroup_disable=foo are:

- foo isn't auto-mounted if you mount all cgroups in a single hierarchy
- foo isn't visible as an individually mountable subsystem

As a result there will only ever be one call to foo->create(), at init time;
all processes will stay in this group, and the group will never be mounted on
a visible hierarchy.  Any additional effects (e.g.  not allocating metadata)
are up to the foo subsystem.

This doesn't handle early_init subsystems (their "disabled" bit isn't set be,
but it could easily be extended to do so if any of the early_init systems
wanted it - I think it would just involve some nastier parameter processing
since it would occur before the command-line argument parser had been run.

Hugh said:

  Ballpark figures, I'm trying to get this question out rather than
  processing the exact numbers: CONFIG_CGROUP_MEM_RES_CTLR adds 15% overhead
  to the affected paths, booting with cgroup_disable=memory cuts that back to
  1% overhead (due to slightly bigger struct page).

  I'm no expert on distros, they may have no interest whatever in
  CONFIG_CGROUP_MEM_RES_CTLR=y; and the rest of us can easily build with or
  without it, or apply the cgroup_disable=memory patches.

Unix bench's execl test result on x86_64 was

== just after boot without mounting any cgroup fs.==
mem_cgorup=off : Execl Throughput       43.0     3150.1      732.6
mem_cgroup=on  : Execl Throughput       43.0     2932.6      682.0
==

[lizf@cn.fujitsu.com: fix boot option parsing]
Signed-off-by: Balbir Singh <balbir@linux.vnet.ibm.com>
Cc: Paul Menage <menage@google.com>
Cc: Balbir Singh <balbir@linux.vnet.ibm.com>
Cc: Pavel Emelyanov <xemul@openvz.org>
Cc: KAMEZAWA Hiroyuki <kamezawa.hiroyu@jp.fujitsu.com>
Cc: Hugh Dickins <hugh@veritas.com>
Cc: Sudhir Kumar <skumar@linux.vnet.ibm.com>
Cc: YAMAMOTO Takashi <yamamoto@valinux.co.jp>
Cc: David Rientjes <rientjes@google.com>
Signed-off-by: Li Zefan <lizf@cn.fujitsu.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2008-04-04 14:46:26 -07:00
Linus Torvalds
3a143125dd Merge branch 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/x86/linux-2.6-x86
* 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/x86/linux-2.6-x86:
  x86: revert assign IRQs to hpet timer
  x86: tsc prevent time going backwards
  xen: Clear PG_pinned in release_{pt,pd}()
  xen: Do not pin/unpin PMD pages
  xen: refactor xen_{alloc,release}_{pt,pd}()
  x86, agpgart: scary messages are fortunately obsolete
  xen: fix grant table bug
  x86: fix breakage of vSMP irq operations
  x86: print message if nmi_watchdog=2 cannot be enabled
  x86: fix nmi_watchdog=2 on Pentium-D CPUs
2008-04-04 14:42:58 -07:00
Thomas Gleixner
5761d64b27 x86: revert assign IRQs to hpet timer
The commits:

commit 37a47db8d7
Author: Balaji Rao <balajirrao@gmail.com>
Date:   Wed Jan 30 13:30:03 2008 +0100

    x86: assign IRQs to HPET timers, fix

and

commit e3f37a54f6
Author: Balaji Rao <balajirrao@gmail.com>
Date:   Wed Jan 30 13:30:03 2008 +0100

    x86: assign IRQs to HPET timers

have been identified to cause a regression on some platforms due to
the assignement of legacy IRQs which makes the legacy devices
connected to those IRQs disfunctional.

Revert them.

This fixes http://bugzilla.kernel.org/show_bug.cgi?id=10382

Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
2008-04-04 18:36:49 +02:00
Tejun Heo
e52dcc4899 libata: ATA_12/16 doesn't fall into ATAPI_MISC
SAT passthrus don't really fit into ATAPI_MISC class.  SAT passthru
commands always transfer multiple of 512 bytes and variable length
response is not allowed.  This patch creates a separate category -
ATAPI_PASS_THRU - for these.

This fixes HSM violation on "hdparm -I".

Signed-off-by: Tejun Heo <htejun@gmail.com>
Signed-off-by: Jeff Garzik <jeff@garzik.org>
2008-04-04 02:43:36 -04:00
Tejun Heo
436d34b362 libata: uninline atapi_cmd_type()
Uninline atapi_cmd_type().  It doesn't really have to be inline and
more case will be added which need to access unexported libata
variable.

Signed-off-by: Tejun Heo <htejun@gmail.com>
Signed-off-by: Jeff Garzik <jeff@garzik.org>
2008-04-04 02:43:35 -04:00
YOSHIFUJI Hideaki
80a9492a33 [IPV4] MROUTE: Adjust include files for user-space.
<linux/mroute.h> needs <linux/types.h>.
Avoid including <linux/in.h> in user-space, which conflicts with
standard <netinet/in.h>.
Add basic struct and constant in <linux/pim.h>.

Signed-off-by: YOSHIFUJI Hideaki <yoshfuji@linux-ipv6.org>
2008-04-04 10:44:42 +09:00
YOSHIFUJI Hideaki
2e8046271f [IPV4] MROUTE: Move PIM definitions to <linux/pim.h>.
Signed-off-by: YOSHIFUJI Hideaki <yoshfuji@linux-ipv6.org>
2008-04-04 10:44:42 +09:00
David S. Miller
3bb5da3837 Merge branch 'master' of git://git.kernel.org/pub/scm/linux/kernel/git/davem/net-2.6 2008-04-03 14:33:42 -07:00
Denis V. Lunev
a4aa834a91 [NETNS]: Declare init_net even without CONFIG_NET defined.
This does not look good, but there is no other choice. The compilation
without CONFIG_NET is broken and can not be fixed with ease.

After that there is no need for the following commits:
1567ca7eec
3edf8fa5cc
2d38f9a4f8

Revert them.

Signed-off-by: Denis V. Lunev <den@openvz.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
2008-04-03 13:04:33 -07:00
David S. Miller
e1ec1b8ccd Merge branch 'master' of git://git.kernel.org/pub/scm/linux/kernel/git/davem/net-2.6
Conflicts:

	drivers/net/s2io.c
2008-04-02 22:35:23 -07:00
YOSHIFUJI Hideaki
de357cc013 [IPV6] NDISC: Don't rely on node-type hint from L2 unless required.
Signed-off-by: YOSHIFUJI Hideaki <yoshfuji@linux-ipv6.org>
2008-04-03 10:06:01 +09:00
YOSHIFUJI Hideaki
300aaeeaab [IPV6] SIT: Add SIOCGETPRL ioctl to get/dump PRL.
Signed-off-by: YOSHIFUJI Hideaki <yoshfuji@linux-ipv6.org>
2008-04-03 10:06:00 +09:00
Templin, Fred L
fadf6bf060 [IPV6] SIT: Add PRL management for ISATAP.
This patch updates the Linux the Intra-Site Automatic Tunnel Addressing
Protocol (ISATAP) implementation. It places the ISATAP potential router
list (PRL) in the kernel and adds three new private ioctls for PRL
management.

[Add several changes of structure name, constant names etc. - yoshfuji]

Signed-off-by: Fred L. Templin <fred.l.templin@boeing.com>
Signed-off-by: YOSHIFUJI Hideaki <yoshfuji@linux-ipv6.org>
2008-04-03 10:05:58 +09:00
Christian Borntraeger
dd135ebbd2 kvm: provide kvm.h for all architecture: fixes headers_install
Currently include/linux/kvm.h is not considered by make headers_install,
because Kbuild cannot handle " unifdef-$(CONFIG_FOO) += foo.h.  This problem
was introduced by

commit fb56dbb31c
Author: Avi Kivity <avi@qumranet.com>
Date:   Sun Dec 2 10:50:06 2007 +0200

    KVM: Export include/linux/kvm.h only if $ARCH actually supports KVM

    Currently, make headers_check barfs due to <asm/kvm.h>, which <linux/kvm.h>
    includes, not existing.  Rather than add a zillion <asm/kvm.h>s, export kvm.
    only if the arch actually supports it.

    Signed-off-by: Avi Kivity <avi@qumranet.com>

which makes this an 2.6.25 regression.

One way of solving the issue is to enhance Kbuild, but Avi and David conviced
me, that changing headers_install is not the way to go.  This patch changes
the definition for linux/kvm.h to unifdef-y.

If  unifdef-y is used for linux/kvm.h "make headers_check" will fail on all
architectures without asm/kvm.h.  Therefore, this patch also provides
asm/kvm.h on all architectures.

Signed-off-by: Christian Borntraeger <borntraeger@de.ibm.com>
Acked-by: Avi Kivity <avi@qumranet.com>
Cc: Sam Ravnborg <sam@ravnborg.org
Cc: David Woodhouse <dwmw2@infradead.org>
Cc: <linux-arch@vger.kernel.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2008-04-02 15:28:18 -07:00
Linus Torvalds
2f819ae881 Merge git://git.kernel.org/pub/scm/linux/kernel/git/davem/net-2.6
* git://git.kernel.org/pub/scm/linux/kernel/git/davem/net-2.6: (45 commits)
  [VLAN]: Proc entry is not renamed when vlan device name changes.
  [IPV6]: Fix ICMP relookup error path dst leak
  [ATM] drivers/atm/iphase.c: compilation warning fix
  IPv6: do not create temporary adresses with too short preferred lifetime
  IPv6: only update the lifetime of the relevant temporary address
  bluetooth : __rfcomm_dlc_close lock fix
  bluetooth : use lockdep sub-classes for diffrent bluetooth protocol
  [ROSE/AX25] af_rose: rose_release() fix
  mac80211: correct use_short_preamble handling
  b43: Fix PCMCIA IRQ routing
  b43: Add DMA mapping failure messages
  mac80211: trigger ieee80211_sta_work after opening interface
  [LLC]: skb allocation size for responses
  [IP] UDP: Use SEQ_START_TOKEN.
  [NET]: Remove Documentation/networking/sk98lin.txt
  [ATM] atm/idt77252.c: Make 2 functions static
  [ATM]: Make atm/he.c:read_prom_byte() static
  [IPV6] MCAST: Ensure to check multicast listener(s).
  [LLC]: Kill llc_station_mac_sa symbol export.
  forcedeth: fix locking bug with netconsole
  ...
2008-04-02 07:46:18 -07:00
Dmitry Torokhov
45d09e1e09 Merge branch 'wm97xx' 2008-04-02 10:02:43 -04:00
Fabio Checconi
34e6bbf23c cfq-iosched: fix rcu freeing of cfq io contexts
SLAB_DESTROY_BY_RCU is not a direct substitute for normal call_rcu()
freeing, since it'll page freeing but NOT object freeing. So change
cfq to do the freeing on its own.

Signed-off-by: Fabio Checconi <fabio@gandalf.sssup.it>
Acked-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
Signed-off-by: Jens Axboe <jens.axboe@oracle.com>
2008-04-02 15:42:20 +02:00
Denis V. Lunev
c0f39322c3 [NETNS]: Do not include net/net_namespace.h from seq_file.h
Signed-off-by: Denis V. Lunev <den@openvz.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
2008-04-02 00:10:28 -07:00