Commit Graph

1031 Commits

Author SHA1 Message Date
Patrick McHardy
2248bcfcd8 [NETFILTER]: Add support for permanent expectations
A permanent expectation exists until timeing out and can expect
multiple related connections.

Signed-off-by: Patrick McHardy <kaber@trash.net>
Signed-off-by: David S. Miller <davem@davemloft.net>
2005-09-06 15:06:42 -07:00
David S. Miller
93c37f2921 [SERIAL]: Avoid 'statement with no effect' warnings.
When SUPPORT_SYSRQ is false, gcc can emit warnings for
the uart_handle_sysrq_char() that results.  Using an
empty inline returning zero kills the warning.

Signed-off-by: David S. Miller <davem@davemloft.net>
2005-09-06 13:57:08 -07:00
Linus Torvalds
f65e77693a Merge master.kernel.org:/pub/scm/linux/kernel/git/gregkh/driver-2.6 2005-09-06 00:32:12 -07:00
James Bottomley
d856f1e337 [PATCH] klist: fix klist to have the same klist_add semantics as list_head
at the moment, the list_head semantics are

list_add(node, head)

whereas current klist semantics are

klist_add(head, node)

This is bound to cause confusion, and since klist is the newcomer, it
should follow the list_head semantics.

I also added missing include guards to klist.h

Signed-off-by: James Bottomley <James.Bottomley@SteelEye.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
2005-09-05 16:03:13 -07:00
Jean Delvare
77ae84554c [PATCH] I2C: Drop the I2C_ACK_TEST ioctl
Drop the I2C_ACK_TEST ioctl, which was commented out. It never really
existed (not after 1999 anyway), and there is no such thing as a ack
test on I2C/SMBus anyway.

Signed-off-by: Jean Delvare <khali@linux-fr.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
2005-09-05 09:26:56 -07:00
4e0c64cfc1 Merge HEAD from gregkh@master.kernel.org:/pub/scm/linux/kernel/git/gregkh/i2c-2.6.git 2005-09-05 09:20:31 -07:00
Jean Delvare
fae91e72b7 [PATCH] I2C: Drop I2C_DEVNAME and i2c_clientname
I2C_DEVNAME and i2c_clientname were introduced in 2.5.68 [1] to help
media/video driver authors who wanted their code to be compatible with
both Linux 2.4 and 2.6. The cause of the incompatibility has gone since
[2], so I think we can get rid of them, as they tend to make the code
harder to read and longer to preprocess/compile for no more benefit.

I'd hope nobody seriously attempts to keep media/video driver compatible
across Linux trees anymore, BTW.

[1] http://marc.theaimsgroup.com/?l=linux-kernel&m=104930186524598&w=2
[2] http://www.linuxhq.com/kernel/v2.6/0-test3/include/linux/i2c.h

Signed-off-by: Jean Delvare <khali@linux-fr.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
2005-09-05 09:14:35 -07:00
Jean Delvare
020789e9cb [PATCH] I2C: Outdated i2c_adapter comment
Delete an outdated comment about i2c_algorithm.id being computed
from algo->id.

Signed-off-by: Jean Delvare <khali@linux-fr.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
2005-09-05 09:14:33 -07:00
Jean Delvare
c2459cf257 [PATCH] I2C: Kill i2c_algorithm.id (7/7)
The I2C_ALGO_* constants have no more users, delete them. Also update
the comments in i2c-id.h so that they reflect the current state of the
file.

Signed-off-by: Jean Delvare <khali@linux-fr.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
2005-09-05 09:14:33 -07:00
Jean Delvare
1684a98430 [PATCH] I2C: Kill i2c_algorithm.id (6/7)
In theory, there should be no more users of I2C_ALGO_* at this point.
However, it happens that several drivers were using I2C_ALGO_* for
adapter ids, so we need to correct these before we can get rid of all
the I2C_ALGO_* definitions.

Note that this also fixes a bug in media/video/tvaudio.c:

	/* don't attach on saa7146 based cards,
	   because dedicated drivers are used */
	if ((adap->id & I2C_ALGO_SAA7146))
		return 0;

This test was plain broken, as it would succeed for many more adapters
than just the saa7146: any those id would share at least one bit with
the saa7146 id. We are really lucky that the few other adapters we want
this driver to work with did not fulfill that condition.

Signed-off-by: Jean Delvare <khali@linux-fr.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
2005-09-05 09:14:32 -07:00
Jean Delvare
c7a46533ff [PATCH] I2C: Kill i2c_algorithm.id (5/7)
Merge the algorithm id part (16 upper bits) of the i2c adapters ids
into the definition of the adapters ids directly. After that, we don't
need to OR both ids together for each i2c_adapter structure.

Signed-off-by: Jean Delvare <khali@linux-fr.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
2005-09-05 09:14:31 -07:00
Jean Delvare
1d8b9e1bad [PATCH] I2C: Kill i2c_algorithm.id (4/7)
There are no more users of i2c_algorithm.id, so we can finally drop
this structure member.

Signed-off-by: Jean Delvare <khali@linux-fr.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
2005-09-05 09:14:29 -07:00
Jean Delvare
e51cc6b3a3 [PATCH] I2C: Kill i2c_algorithm.id (2/7)
Use the adapter id rather than the algorithm id to detect the i2c-isa
pseudo-adapter. This saves one level of dereferencing, and the
algorithm ids will soon be gone anyway.

Signed-off-by: Jean Delvare <khali@linux-fr.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
2005-09-05 09:14:28 -07:00
Jean Delvare
975185880d [PATCH] I2C: Kill i2c_algorithm.name (1/7)
The name member of the i2c_algorithm is never used, although all
drivers conscientiously fill it. We can drop it completely, this
structure doesn't need to have a name.

Signed-off-by: Jean Delvare <khali@linux-fr.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
2005-09-05 09:14:27 -07:00
Jean Delvare
d0f282706d [PATCH] hwmon: hwmon vs i2c, second round (10/11)
I see very little reason why vid_from_reg is inlined. It is not
exactly short, its parameters are seldom known in advance, and it is
never called in speed critical areas. Uninlining it should cause
little performance loss if any, and saves a signficant space as well
as compilation time.

As suggested by Alexey Dobriyan, I am leaving vid_to_reg inline for now,
as it is short and has a single user so far.

Signed-off-by: Jean Delvare <khali@linux-fr.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
2005-09-05 09:14:23 -07:00
Jean Delvare
ee70d3a333 [PATCH] hwmon: hwmon vs i2c, second round (09/11)
Delete DEFAULT_VRM from hwmon-vid.h, it has no more users.

Signed-off-by: Jean Delvare <khali@linux-fr.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
2005-09-05 09:14:23 -07:00
Jean Delvare
303760b44a [PATCH] hwmon: hwmon vs i2c, second round (07/11)
The only part left in i2c-sensor is the VRM/VRD/VID handling code.
This is in no way related to i2c, so it doesn't belong there. Move
the code to hwmon, where it belongs.

Note that not all hardware monitoring drivers do VRM/VRD/VID
operations, so less drivers depend on hwmon-vid than there were
depending on i2c-sensor.

Signed-off-by: Jean Delvare <khali@linux-fr.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
2005-09-05 09:14:22 -07:00
Jean Delvare
f4b5026120 [PATCH] hwmon: hwmon vs i2c, second round (06/11)
The only thing left in i2c-sensor.h are module parameter definition
macros. It's only an extension of what i2c.h offers, and this extension
is not sensors-specific. As a matter of fact, a few non-sensors drivers
use them. So we better merge them in i2c.h, and get rid of i2c-sensor.h
altogether.

Signed-off-by: Jean Delvare <khali@linux-fr>
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
2005-09-05 09:14:21 -07:00
Jean Delvare
96478ef3f3 [PATCH] hwmon: hwmon vs i2c, second round (05/11)
The i2c_detect function has no more user, delete it.

Signed-off-by: Jean Delvare <khali@linux-fr.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
2005-09-05 09:14:20 -07:00
Jean Delvare
b78ec31582 [PATCH] hwmon: hwmon vs i2c, second round (03/11)
We now have two identical structures, i2c_address_data in i2c-sensor.h
and i2c_client_address_data in i2c.h. We can kill one of them, I choose
to keep the one in i2c.h as it makes more sense (this structure is not
specific to sensors.)

Signed-off-by: Jean Delvare <khali@linux-fr.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
2005-09-05 09:14:19 -07:00
Jean Delvare
ef8dec5d8b [PATCH] hwmon: hwmon vs i2c, second round (02/11)
The way i2c-sensor handles forced addresses could be optimized. It
defines a structure (i2c_force_data) to associate a module parameter
with a given kind value, but in fact this kind value is always the
index of the structure in each array it is used in. So this additional
value can be omitted, and still be deduced in the code handling these
arrays.

Signed-off-by: Jean Delvare <khali@linux-fr.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
2005-09-05 09:14:18 -07:00
Jean Delvare
9fc6adfa9a [PATCH] hwmon: hwmon vs i2c, second round (01/11)
Add support for kind-forced addresses to i2c_probe, like i2c_detect
has for (essentially) hardware monitoring drivers.

Note that this change will slightly increase the size of the drivers
using I2C_CLIENT_INSMOD, with no immediate benefit. This is a
requirement if we want to merge i2c_probe and i2c_detect though, and
seems a reasonable price to pay in comparison with the previous
cleanups which saved much more than that (such as the i2c-isa cleanup
or the i2c address ranges removal.)

Signed-off-by: Jean Delvare <khali@linux-fr.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
2005-09-05 09:14:18 -07:00
Jean Delvare
53ae11b083 [PATCH] hwmon: move SENSORS_LIMIT to hwmon.h
Move SENSORS_LIMIT from i2c-sensor.h to hwmon.h, as it is in no way
related to i2c.

Signed-off-by: Jean Delvare <khali@linux-fr.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
2005-09-05 09:14:17 -07:00
Jean Delvare
cdcb192197 [PATCH] I2C: inline i2c_adapter_id
We could inline i2c_adapter_id, as it is really, really short. Doing
so saves a few bytes both in i2c-core and in the drivers using this
function.

                                            before     after      diff
drivers/hwmon/adm1026.ko                     41344     41305       -39
drivers/hwmon/asb100.ko                      27325     27246       -79
drivers/hwmon/gl518sm.ko                     20824     20785       -39
drivers/hwmon/it87.ko                        26419     26380       -39
drivers/hwmon/lm78.ko                        21424     21385       -39
drivers/hwmon/lm85.ko                        41034     40939       -95
drivers/hwmon/w83781d.ko                     39561     39514       -47
drivers/hwmon/w83792d.ko                     32979     32932       -47
drivers/i2c/i2c-core.ko                      24708     24531      -177

Signed-off-by: Jean Delvare <khali@linux-fr.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
2005-09-05 09:14:15 -07:00
R.Marek@sh.cvut.cz
5563e27d3a [PATCH] I2C: W83792D driver 1/3
I would like to announce support for W83792D chip. This driver was developed
by Winbond Electronics Corp. I added sysfs attributes callbacks infrastructure
plus various code fixes and codingstyle cleanups. I would like to thank Winbond
for supporting free software.

This patch is against 2.6.13rc3 plus hwmon-class and hwmon-split.
Separate patch for documantation and hwmon class register will follow.

Signed-off-by: Rudolf Marek <r.marek@sh.cvut.cz>
Signed-off-by: Chunhao Huang <DZShen@Winbond.com.tw>
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
2005-09-05 09:14:13 -07:00
Jean Delvare
570aefc361 [PATCH] I2C: Separate non-i2c hwmon drivers from i2c-core (9/9)
Move the definitions of i2c_is_isa_client and i2c_is_isa_adapter from
i2c.h to i2c-isa.h. Only hybrid drivers still need them.

Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
2005-09-05 09:14:12 -07:00
Jean Delvare
5071860aba [PATCH] I2C: Separate non-i2c hwmon drivers from i2c-core (7/9)
Kill normal_isa in header files, documentation and all chip drivers, as
it is no more used.

normal_i2c could be renamed to normal, but I decided not to do so at the
moment, so as to limit the number of changes. This might be done later
as part of the i2c_probe/i2c_detect merge.

Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
2005-09-05 09:14:12 -07:00
Jean Delvare
400c455eaa [PATCH] I2C: Separate non-i2c hwmon drivers from i2c-core (2/9)
Convert i2c-isa from a dumb i2c_adapter into a pseudo i2c-core for ISA
hardware monitoring drivers. The isa i2c_adapter is no more registered
with i2c-core, drivers have to explicitely connect to it using the new
i2c_isa_{add,del}_driver interface.

At this point, all ISA chip drivers are useless, because they still
register with i2c-core in the hope i2c-isa is registered there as well,
but it isn't anymore.

The fake bus will be named i2c-9191 in sysfs. This is the number it
already had internally in various places, so it's not exactly new,
except that now the number is seen in userspace as well. This shouldn't
be a problem until someone really has 9192 I2C busses in a given system
;)

The fake bus will no more show in "i2cdetect -l", as it won't be seen by
i2c-dev anymore (not being registered with i2c-core), which is a good
thing, as i2cdetect/i2cdump/i2cset cannot operate on this fake bus
anyway.

Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
2005-09-05 09:14:09 -07:00
Jean Delvare
efde723fda [PATCH] I2C: Separate non-i2c hwmon drivers from i2c-core (1/9)
Temporarily export a few structures and functions from i2c-core, because we
will soon need them in i2c-isa.

Signed-off-by: Jean Delvare <khali@linux-fr.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
2005-09-05 09:14:09 -07:00
Mark M. Hoffman
1236441f38 [PATCH] I2C hwmon: hwmon sysfs class
This patch adds the sysfs class "hwmon" for use by hardware monitoring
(sensors) chip drivers.  It also fixes up the related Kconfig/Makefile
bits.

Signed-off-by: Mark M. Hoffman <mhoffman@lightlink.com>
Signed-off-by: Jean Delvare <khali@linux-fr.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
2005-09-05 09:14:07 -07:00
bgardner@wabtec.com
a61fc683ae [PATCH] I2C: add kobj_to_i2c_client
Move the inline function kobj_to_i2c_client() from max6875.c to i2c.h.

Signed-off-by: Ben Gardner <bgardner@wabtec.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
2005-09-05 09:14:05 -07:00
Linus Torvalds
94f8c66e5e Merge branch 'upstream' of master.kernel.org:/pub/scm/linux/kernel/git/jgarzik/libata-dev 2005-09-05 05:50:36 -07:00
Jeff Garzik
d0bd99299b /spare/repo/libata-dev branch 'iomap-try3' 2005-09-05 05:20:33 -04:00
Jeff Garzik
586a4ac509 /spare/repo/libata-dev branch 'master' 2005-09-05 05:16:50 -04:00
Linus Torvalds
da1f136c26 Merge master.kernel.org:/home/rmk/linux-2.6-mmc 2005-09-05 00:18:09 -07:00
Linus Torvalds
48467641bc Merge master.kernel.org:/pub/scm/linux/kernel/git/davem/net-2.6 2005-09-05 00:11:50 -07:00
Bodo Stroesser
1b38f0064e [PATCH] Uml support: add PTRACE_SYSEMU_SINGLESTEP option to i386
This patch implements the new ptrace option PTRACE_SYSEMU_SINGLESTEP, which
can be used by UML to singlestep a process: it will receive SINGLESTEP
interceptions for normal instructions and syscalls, but syscall execution will
be skipped just like with PTRACE_SYSEMU.

Signed-off-by: Bodo Stroesser <bstroesser@fujitsu-siemens.com>
Signed-off-by: Paolo 'Blaisorblade' Giarrusso <blaisorblade@yahoo.it>
Cc: Jeff Dike <jdike@addtoit.com>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2005-09-05 00:06:20 -07:00
Laurent Vivier
ed75e8d580 [PATCH] UML Support - Ptrace: adds the host SYSEMU support, for UML and general usage
Jeff Dike <jdike@addtoit.com>,
      Paolo 'Blaisorblade' Giarrusso <blaisorblade_spam@yahoo.it>,
      Bodo Stroesser <bstroesser@fujitsu-siemens.com>

Adds a new ptrace(2) mode, called PTRACE_SYSEMU, resembling PTRACE_SYSCALL
except that the kernel does not execute the requested syscall; this is useful
to improve performance for virtual environments, like UML, which want to run
the syscall on their own.

In fact, using PTRACE_SYSCALL means stopping child execution twice, on entry
and on exit, and each time you also have two context switches; with SYSEMU you
avoid the 2nd stop and so save two context switches per syscall.

Also, some architectures don't have support in the host for changing the
syscall number via ptrace(), which is currently needed to skip syscall
execution (UML turns any syscall into getpid() to avoid it being executed on
the host).  Fixing that is hard, while SYSEMU is easier to implement.

* This version of the patch includes some suggestions of Jeff Dike to avoid
  adding any instructions to the syscall fast path, plus some other little
  changes, by myself, to make it work even when the syscall is executed with
  SYSENTER (but I'm unsure about them). It has been widely tested for quite a
  lot of time.

* Various fixed were included to handle the various switches between
  various states, i.e. when for instance a syscall entry is traced with one of
  PT_SYSCALL / _SYSEMU / _SINGLESTEP and another one is used on exit.
  Basically, this is done by remembering which one of them was used even after
  the call to ptrace_notify().

* We're combining TIF_SYSCALL_EMU with TIF_SYSCALL_TRACE or TIF_SINGLESTEP
  to make do_syscall_trace() notice that the current syscall was started with
  SYSEMU on entry, so that no notification ought to be done in the exit path;
  this is a bit of a hack, so this problem is solved in another way in next
  patches.

* Also, the effects of the patch:
"Ptrace - i386: fix Syscall Audit interaction with singlestep"
are cancelled; they are restored back in the last patch of this series.

Detailed descriptions of the patches doing this kind of processing follow (but
I've already summed everything up).

* Fix behaviour when changing interception kind #1.

  In do_syscall_trace(), we check the status of the TIF_SYSCALL_EMU flag
  only after doing the debugger notification; but the debugger might have
  changed the status of this flag because he continued execution with
  PTRACE_SYSCALL, so this is wrong.  This patch fixes it by saving the flag
  status before calling ptrace_notify().

* Fix behaviour when changing interception kind #2:
  avoid intercepting syscall on return when using SYSCALL again.

  A guest process switching from using PTRACE_SYSEMU to PTRACE_SYSCALL
  crashes.

  The problem is in arch/i386/kernel/entry.S.  The current SYSEMU patch
  inhibits the syscall-handler to be called, but does not prevent
  do_syscall_trace() to be called after this for syscall completion
  interception.

  The appended patch fixes this.  It reuses the flag TIF_SYSCALL_EMU to
  remember "we come from PTRACE_SYSEMU and now are in PTRACE_SYSCALL", since
  the flag is unused in the depicted situation.

* Fix behaviour when changing interception kind #3:
  avoid intercepting syscall on return when using SINGLESTEP.

  When testing 2.6.9 and the skas3.v6 patch, with my latest patch and had
  problems with singlestepping on UML in SKAS with SYSEMU.  It looped
  receiving SIGTRAPs without moving forward.  EIP of the traced process was
  the same for all SIGTRAPs.

What's missing is to handle switching from PTRACE_SYSCALL_EMU to
PTRACE_SINGLESTEP in a way very similar to what is done for the change from
PTRACE_SYSCALL_EMU to PTRACE_SYSCALL_TRACE.

I.e., after calling ptrace(PTRACE_SYSEMU), on the return path, the debugger is
notified and then wake ups the process; the syscall is executed (or skipped,
when do_syscall_trace() returns 0, i.e.  when using PTRACE_SYSEMU), and
do_syscall_trace() is called again.  Since we are on the return path of a
SYSEMU'd syscall, if the wake up is performed through ptrace(PTRACE_SYSCALL),
we must still avoid notifying the parent of the syscall exit.  Now, this
behaviour is extended even to resuming with PTRACE_SINGLESTEP.

Signed-off-by: Paolo 'Blaisorblade' Giarrusso <blaisorblade@yahoo.it>
Cc: Jeff Dike <jdike@addtoit.com>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2005-09-05 00:06:20 -07:00
Pavel Machek
ca078bae81 [PATCH] swsusp: switch pm_message_t to struct
This adds type-checking to pm_message_t, so that people can't confuse it
with int or u32.  It also allows us to fix "disk yoyo" during suspend (disk
spinning down/up/down).

[We've tried that before; since that cpufreq problems were fixed and I've
tried make allyes config and fixed resulting damage.]

Signed-off-by: Pavel Machek <pavel@suse.cz>
Signed-off-by: Alexander Nyberg <alexn@telia.com>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2005-09-05 00:06:16 -07:00
Matt Tolentino
7ae65fd334 [PATCH] x86: fix EFI memory map parsing
The memory descriptors that comprise the EFI memory map are not fixed in
stone such that the size could change in the future.  This uses the memory
descriptor size obtained from EFI to iterate over the memory map entries
during boot.  This enables the removal of an x86 specific pad (and ifdef)
in the EFI header.  I also couldn't stomach the broken up nature of the
function to put EFI runtime calls into virtual mode any longer so I fixed
that up a bit as well.

For reference, this patch only impacts x86.

Signed-off-by: Matt Tolentino <matthew.e.tolentino@intel.com>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2005-09-05 00:06:09 -07:00
Ralf Baechle
dc4ec916f6 [PATCH] MIPS Technologies PCI ID bits
- MIPS Denmark does no longer exist; the PCI vendor ID is now owned by
  MIPS Technologies.

- Add ID for SOC-it, MIPS's system controller.

Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2005-09-05 00:06:04 -07:00
Mark A. Greer
d01c08c9ae [PATCH] ppc32: mv64x60 updates & enhancements
Updates and enhancement to the ppc32 mv64x60 code:
- move code to get mem size from mem ctlr to bootwrapper
- address some errata in the mv64360 pic code
- some minor cleanups
- export one of the bridge's regs via sysfs so user daemon can watch for
  extraction events

Signed-off-by: Mark A. Greer <mgreer@mvista.com>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2005-09-05 00:06:00 -07:00
Martin Hicks
c07e02db76 [PATCH] VM: add page_state info to per-node meminfo
Add page_state info to the per-node meminfo file in sysfs.  This is mostly
just for informational purposes.

The lack of this information was brought up recently during a discussion
regarding pagecache clearing, and I put this patch together to test out one
of the suggestions.

It seems like interesting info to have, so I'm submitting the patch.

Signed-off-by: Martin Hicks <mort@sgi.com>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2005-09-05 00:05:49 -07:00
Chen, Kenneth W
0e5c9f39f6 [PATCH] remove hugetlb_clean_stale_pgtable() and fix huge_pte_alloc()
I don't think we need to call hugetlb_clean_stale_pgtable() anymore
in 2.6.13 because of the rework with free_pgtables().  It now collect
all the pte page at the time of munmap.  It used to only collect page
table pages when entire one pgd can be freed and left with staled pte
pages.  Not anymore with 2.6.13.  This function will never be called
and We should turn it into a BUG_ON.

I also spotted two problems here, not Adam's fault :-)
(1) in huge_pte_alloc(), it looks like a bug to me that pud is not
    checked before calling pmd_alloc()
(2) in hugetlb_clean_stale_pgtable(), it also missed a call to
    pmd_free_tlb.  I think a tlb flush is required to flush the mapping
    for the page table itself when we clear out the pmd pointing to a
    pte page.  However, since hugetlb_clean_stale_pgtable() is never
    called, so it won't trigger the bug.

Signed-off-by: Ken Chen <kenneth.w.chen@intel.com>
Cc: Adam Litke <agl@us.ibm.com>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2005-09-05 00:05:46 -07:00
Deepak Saxena
fd195c49fb [PATCH] arm: allow for arch-specific IOREMAP_MAX_ORDER
Version 6 of the ARM architecture introduces the concept of 16MB pages
(supersections) and 36-bit (40-bit actually, but nobody uses this) physical
addresses.  36-bit addressed memory and I/O and ARMv6 can only be mapped
using supersections and the requirement on these is that both virtual and
physical addresses be 16MB aligned.  In trying to add support for ioremap()
of 36-bit I/O, we run into the issue that get_vm_area() allows for a
maximum of 512K alignment via the IOREMAP_MAX_ORDER constant.  To work
around this, we can:

- Allocate a larger VM area than needed (size + (1ul << IOREMAP_MAX_ORDER))
  and then align the pointer ourselves, but this ends up with 512K of
  wasted VM per ioremap().

- Provide a new __get_vm_area_aligned() API and make __get_vm_area() sit
  on top of this. I did this and it works but I don't like the idea
  adding another VM API just for this one case.

- My preferred solution which is to allow the architecture to override
  the IOREMAP_MAX_ORDER constant with it's own version.

Signed-off-by: Deepak Saxena <dsaxena@plexity.net>
Cc: Russell King <rmk@arm.linux.org.uk>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2005-09-05 00:05:46 -07:00
Paolo 'Blaisorblade' Giarrusso
e83a959671 [PATCH] comment typo fix
smp_entry_t -> swap_entry_t

Signed-off-by: Paolo 'Blaisorblade' Giarrusso <blaisorblade@yahoo.it>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2005-09-05 00:05:45 -07:00
Martin Hicks
bce5f6ba34 [PATCH] VM: add capabilites check to set_zone_reclaim
Add a capability check to sys_set_zone_reclaim().  This syscall is not
something that should be available to a user.

Signed-off-by:  Martin Hicks <mort@sgi.com>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2005-09-05 00:05:44 -07:00
Nick Piggin
242e546862 [PATCH] mm: remove atomic
This bitop does not need to be atomic because it is performed when there will
be no references to the page (ie.  the page is being freed).

Signed-off-by: Nick Piggin <npiggin@suse.de>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2005-09-05 00:05:44 -07:00
Christoph Lameter
6e21c8f145 [PATCH] /proc/<pid>/numa_maps to show on which nodes pages reside
This patch was recently discussed on linux-mm:
http://marc.theaimsgroup.com/?t=112085728500002&r=1&w=2

I inherited a large code base from Ray for page migration.  There was a
small patch in there that I find to be very useful since it allows the
display of the locality of the pages in use by a process.  I reworked that
patch and came up with a /proc/<pid>/numa_maps that gives more information
about the vma's of a process.  numa_maps is indexes by the start address
found in /proc/<pid>/maps.  F.e.  with this patch you can see the page use
of the "getty" process:

margin:/proc/12008 # cat maps
00000000-00004000 r--p 00000000 00:00 0
2000000000000000-200000000002c000 r-xp 00000000 08:04 516                /lib/ld-2.3.3.so
2000000000038000-2000000000040000 rw-p 00028000 08:04 516                /lib/ld-2.3.3.so
2000000000040000-2000000000044000 rw-p 2000000000040000 00:00 0
2000000000058000-2000000000260000 r-xp 00000000 08:04 54707842           /lib/tls/libc.so.6.1
2000000000260000-2000000000268000 ---p 00208000 08:04 54707842           /lib/tls/libc.so.6.1
2000000000268000-2000000000274000 rw-p 00200000 08:04 54707842           /lib/tls/libc.so.6.1
2000000000274000-2000000000280000 rw-p 2000000000274000 00:00 0
2000000000280000-20000000002b4000 r--p 00000000 08:04 9126923            /usr/lib/locale/en_US.utf8/LC_CTYPE
2000000000300000-2000000000308000 r--s 00000000 08:04 60071467           /usr/lib/gconv/gconv-modules.cache
2000000000318000-2000000000328000 rw-p 2000000000318000 00:00 0
4000000000000000-4000000000008000 r-xp 00000000 08:04 29576399           /sbin/mingetty
6000000000004000-6000000000008000 rw-p 00004000 08:04 29576399           /sbin/mingetty
6000000000008000-600000000002c000 rw-p 6000000000008000 00:00 0          [heap]
60000fff7fffc000-60000fff80000000 rw-p 60000fff7fffc000 00:00 0
60000ffffff44000-60000ffffff98000 rw-p 60000ffffff44000 00:00 0          [stack]
a000000000000000-a000000000020000 ---p 00000000 00:00 0                  [vdso]

cat numa_maps
2000000000000000 default MaxRef=43 Pages=11 Mapped=11 N0=4 N1=3 N2=2 N3=2
2000000000038000 default MaxRef=1 Pages=2 Mapped=2 Anon=2 N0=2
2000000000040000 default MaxRef=1 Pages=1 Mapped=1 Anon=1 N0=1
2000000000058000 default MaxRef=43 Pages=61 Mapped=61 N0=14 N1=15 N2=16 N3=16
2000000000268000 default MaxRef=1 Pages=2 Mapped=2 Anon=2 N0=2
2000000000274000 default MaxRef=1 Pages=3 Mapped=3 Anon=3 N0=3
2000000000280000 default MaxRef=8 Pages=3 Mapped=3 N0=3
2000000000300000 default MaxRef=8 Pages=2 Mapped=2 N0=2
2000000000318000 default MaxRef=1 Pages=1 Mapped=1 Anon=1 N2=1
4000000000000000 default MaxRef=6 Pages=2 Mapped=2 N1=2
6000000000004000 default MaxRef=1 Pages=1 Mapped=1 Anon=1 N0=1
6000000000008000 default MaxRef=1 Pages=1 Mapped=1 Anon=1 N0=1
60000fff7fffc000 default MaxRef=1 Pages=1 Mapped=1 Anon=1 N0=1
60000ffffff44000 default MaxRef=1 Pages=1 Mapped=1 Anon=1 N0=1

getty uses ld.so.  The first vma is the code segment which is used by 43
other processes and the pages are evenly distributed over the 4 nodes.

The second vma is the process specific data portion for ld.so.  This is
only one page.

The display format is:

<startaddress>	 Links to information in /proc/<pid>/map
<memory policy>  This can be "default" "interleave={}", "prefer=<node>" or "bind={<zones>}"
MaxRef=		<maximum reference to a page in this vma>
Pages=		<Nr of pages in use>
Mapped=		<Nr of pages with mapcount >
Anon=		<nr of anonymous pages>
Nx=		<Nr of pages on Node x>

The content of the proc-file is self-evident.  If this would be tied into
the sparsemem system then the contents of this file would not be too
useful.

Signed-off-by: Christoph Lameter <clameter@sgi.com>
Cc: Hugh Dickins <hugh@veritas.com>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2005-09-05 00:05:43 -07:00
Hugh Dickins
5d337b9194 [PATCH] swap: swap_lock replace list+device
The idea of a swap_device_lock per device, and a swap_list_lock over them all,
is appealing; but in practice almost every holder of swap_device_lock must
already hold swap_list_lock, which defeats the purpose of the split.

The only exceptions have been swap_duplicate, valid_swaphandles and an
untrodden path in try_to_unuse (plus a few places added in this series).
valid_swaphandles doesn't show up high in profiles, but swap_duplicate does
demand attention.  However, with the hold time in get_swap_pages so much
reduced, I've not yet found a load and set of swap device priorities to show
even swap_duplicate benefitting from the split.  Certainly the split is mere
overhead in the common case of a single swap device.

So, replace swap_list_lock and swap_device_lock by spinlock_t swap_lock
(generally we seem to prefer an _ in the name, and not hide in a macro).

If someone can show a regression in swap_duplicate, then probably we should
add a hashlock for the swap_map entries alone (shorts being anatomic), so as
to help the case of the single swap device too.

Signed-off-by: Hugh Dickins <hugh@veritas.com>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2005-09-05 00:05:42 -07:00
Hugh Dickins
52b7efdbe5 [PATCH] swap: scan_swap_map drop swap_device_lock
get_swap_page has often shown up on latency traces, doing lengthy scans while
holding two spinlocks.  swap_list_lock is already dropped, now scan_swap_map
drop swap_device_lock before scanning the swap_map.

While scanning for an empty cluster, don't worry that racing tasks may
allocate what was free and free what was allocated; but when allocating an
entry, check it's still free after retaking the lock.  Avoid dropping the lock
in the expected common path.  No barriers beyond the locks, just let the
cookie crumble; highest_bit limit is volatile, but benign.

Guard against swapoff: must check SWP_WRITEOK before allocating, must raise
SWP_SCANNING reference count while in scan_swap_map, swapoff wait for that to
fall - just use schedule_timeout, we don't want to burden scan_swap_map
itself, and it's very unlikely that anyone can really still be in
scan_swap_map once swapoff gets this far.

Signed-off-by: Hugh Dickins <hugh@veritas.com>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2005-09-05 00:05:41 -07:00
Hugh Dickins
6eb396dc4a [PATCH] swap: swap unsigned int consistency
The swap header's unsigned int last_page determines the range of swap pages,
but swap_info has been using int or unsigned long in some cases: use unsigned
int throughout (except, in several places a local unsigned long is useful to
avoid overflows when adding).

Signed-off-by: Hugh Dickins <hugh@veritas.com>
Signed-off-by: Jens Axboe <axboe@suse.de>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2005-09-05 00:05:41 -07:00
Hugh Dickins
53092a7402 [PATCH] swap: show span of swap extents
The "Adding %dk swap" message shows the number of swap extents, as a guide to
how fragmented the swapfile may be.  But a useful further guide is what total
extent they span across (sometimes scarily large).

And there's no need to keep nr_extents in swap_info: it's unused after the
initial message, so save a little space by keeping it on stack.

Signed-off-by: Hugh Dickins <hugh@veritas.com>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2005-09-05 00:05:40 -07:00
Hugh Dickins
11d31886db [PATCH] swap: swap extent list is ordered
There are several comments that swap's extent_list.prev points to the lowest
extent: that's not so, it's extent_list.next which points to it, as you'd
expect.  And a couple of loops in add_swap_extent which go all the way through
the list, when they should just add to the other end.

Fix those up, and let map_swap_page search the list forwards: profiles shows
it to be twice as quick that way - because prefetch works better on how the
structs are typically kmalloc'ed?  or because usually more is written to than
read from swap, and swap is allocated ascendingly?

Signed-off-by: Hugh Dickins <hugh@veritas.com>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2005-09-05 00:05:40 -07:00
Dave Hansen
28ae55c98e [PATCH] sparsemem extreme: hotplug preparation
This splits up sparse_index_alloc() into two pieces.  This is needed
because we'll allocate the memory for the second level in a different place
from where we actually consume it to keep the allocation from happening
underneath a lock

Signed-off-by: Dave Hansen <haveblue@us.ibm.com>
Signed-off-by: Bob Picco <bob.picco@hp.com>
Cc: Andy Whitcroft <apw@shadowen.org>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2005-09-05 00:05:38 -07:00
Bob Picco
3e347261a8 [PATCH] sparsemem extreme implementation
With cleanups from Dave Hansen <haveblue@us.ibm.com>

SPARSEMEM_EXTREME makes mem_section a one dimensional array of pointers to
mem_sections.  This two level layout scheme is able to achieve smaller
memory requirements for SPARSEMEM with the tradeoff of an additional shift
and load when fetching the memory section.  The current SPARSEMEM
implementation is a one dimensional array of mem_sections which is the
default SPARSEMEM configuration.  The patch attempts isolates the
implementation details of the physical layout of the sparsemem section
array.

SPARSEMEM_EXTREME requires bootmem to be functioning at the time of
memory_present() calls.  This is not always feasible, so architectures
which do not need it may allocate everything statically by using
SPARSEMEM_STATIC.

Signed-off-by: Andy Whitcroft <apw@shadowen.org>
Signed-off-by: Bob Picco <bob.picco@hp.com>
Signed-off-by: Dave Hansen <haveblue@us.ibm.com>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2005-09-05 00:05:38 -07:00
Bob Picco
802f192e4a [PATCH] SPARSEMEM EXTREME
A new option for SPARSEMEM is ARCH_SPARSEMEM_EXTREME.  Architecture
platforms with a very sparse physical address space would likely want to
select this option.  For those architecture platforms that don't select the
option, the code generated is equivalent to SPARSEMEM currently in -mm.
I'll be posting a patch on ia64 ml which uses this new SPARSEMEM feature.

ARCH_SPARSEMEM_EXTREME makes mem_section a one dimensional array of
pointers to mem_sections.  This two level layout scheme is able to achieve
smaller memory requirements for SPARSEMEM with the tradeoff of an
additional shift and load when fetching the memory section.  The current
SPARSEMEM -mm implementation is a one dimensional array of mem_sections
which is the default SPARSEMEM configuration.  The patch attempts isolates
the implementation details of the physical layout of the sparsemem section
array.

ARCH_SPARSEMEM_EXTREME depends on 64BIT and is by default boolean false.

I've boot tested under aim load ia64 configured for ARCH_SPARSEMEM_EXTREME.
 I've also boot tested a 4 way Opteron machine with !ARCH_SPARSEMEM_EXTREME
and tested with aim.

Signed-off-by: Andy Whitcroft <apw@shadowen.org>
Signed-off-by: Bob Picco <bob.picco@hp.com>
Signed-off-by: Dave Hansen <haveblue@us.ibm.com>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2005-09-05 00:05:38 -07:00
Vojtech Pavlik
8a409b0118 Input: HID - add more consumer usages
Extend mapping of the consumer usage page in hid-input.c to handle
more cases appearing on new USB keyboards.

Signed-off-by: Vojtech Pavlik <vojtech@suse.cz>
Signed-off-by: Dmitry Torokhov <dtor@mail.ru>
2005-09-05 00:08:08 -05:00
Pierre Ossman
865e9f13c9 [MMC] ios for mmc chip select
Adds a new ios for setting the chip select pin on MMC cards. Needed on
SD controllers which use this pin for other things and therefore cannot
have it pulled high at all times.

Signed-off-by: Pierre Ossman <drzeus@drzeus.cx>
Signed-off-by: Russell King <rmk+kernel@arm.linux.org.uk>
2005-09-03 16:45:02 +01:00
Len Brown
129521dcc9 Merge linux-2.6 into linux-acpi-2.6 test 2005-09-03 02:44:09 -04:00
Rolf Eike Beer
d51fe1be3f [PATCH] remove driverfs references from include/linux/cpu.h and net/sunrpc/rpc_pipe.c
This patch is against 2.6.10, but still applies cleanly. It's just
s/driverfs/sysfs/ in these two files.

Signed-off-by: Rolf Eike Beer <eike-kernel@sf-tec.de>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2005-09-02 00:57:31 -07:00
Linus Torvalds
138307b475 Merge HEAD from master.kernel.org:/home/rmk/linux-2.6-serial 2005-09-02 00:53:36 -07:00
Herbert Xu
64baf3cfea [CRYPTO]: Added CRYPTO_TFM_REQ_MAY_SLEEP flag
The crypto layer currently uses in_atomic() to determine whether it is
allowed to sleep.  This is incorrect since spin locks don't always cause
in_atomic() to return true.

Instead of that, this patch returns to an earlier idea of a per-tfm flag
which determines whether sleeping is allowed.  Unlike the earlier version,
the default is to not allow sleeping.  This ensures that no existing code
can break.

As usual, this flag may either be set through crypto_alloc_tfm(), or
just before a specific crypto operation.

Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
Signed-off-by: David S. Miller <davem@davemloft.net>
2005-09-01 17:43:05 -07:00
Mike Kershaw
ff4cc3ac93 [TUNTAP]: Allow setting the linktype of the tap device from userspace
Currently tun/tap only supports the EN10MB ARP type.  For use with
wireless and other networking types it should be possible to set the
ARP type via an ioctl.

Patch v2: Included check that the tap interface is down before changing the
link type out from underneath it

Signed-off-by: Mike Kershaw <dragorn@kismetwireless.net>
Signed-off-by: David S. Miller <davem@davemloft.net>
2005-09-01 17:40:05 -07:00
Jeff Garzik
e3ee3b78f8 /spare/repo/netdev-2.6 branch 'master' 2005-09-01 18:02:01 -04:00
Russell King
bc49a661e6 [SERIAL] Move serial8250_*_port prototypes to linux/serial_8250.h
Signed-off-by: Russell King <rmk+kernel@arm.linux.org.uk>
2005-09-01 15:56:26 +01:00
Sascha Hauer
0f302dc354 [ARM] 2866/1: add i.MX set_mctrl / get_mctrl functions
Patch from Sascha Hauer

This patch adds support for setting and getting RTS / CTS via
set_mtctrl / get_mctrl functions.

Signed-off-by: Sascha Hauer <s.hauer@pengutronix.de>
Signed-off-by: Russell King <rmk+kernel@arm.linux.org.uk>
2005-08-31 21:48:47 +01:00
Russell King
b129a8ccd5 [SERIAL] Clean up and fix tty transmission start/stoping
The start_tx and stop_tx methods were passed a flag to indicate
whether the start/stop was from the tty start/stop callbacks, and
some drivers used this flag to decide whether to ask the UART to
immediately stop transmission (where the UART supports such a
feature.)

There are other cases when we wish this to occur - when CTS is
lowered, or if we change from soft to hard flow control and CTS
is inactive.  In these cases, this flag was false, and we would
allow the transmitter to drain before stopping.

There is really only one case where we want to let the transmitter
drain before disabling, and that's when we run out of characters
to send.

Hence, re-jig the start_tx and stop_tx methods to eliminate this
flag, and introduce new functions for the special "disable and
allow transmitter to drain" case.

Signed-off-by: Russell King <rmk+kernel@arm.linux.org.uk>
2005-08-31 10:12:14 +01:00
James Bottomley
61a7afa2c4 [SCSI] embryonic RAID class
The idea behind a RAID class is to provide a uniform interface to all
RAID subsystems (both hardware and software) in the kernel.

To do that, I've made this class a transport class that's entirely
subsystem independent (although the matching routines have to match per
subsystem, as you'll see looking at the code).  I put it in the scsi
subdirectory purely because I needed somewhere to play with it, but it's
not a scsi specific module.

I used a fusion raid card as the test bed for this; with that kind of
card, this is the type of class output you get:

jejb@titanic> ls -l /sys/class/raid_devices/20\:0\:0\:0/
total 0
lrwxrwxrwx  1 root root     0 Aug 16 17:21 component-0 -> ../../../devices/pci0000:80/0000:80:04.0/host20/target20:1:0/20:1:0:0/
lrwxrwxrwx  1 root root     0 Aug 16 17:21 component-1 -> ../../../devices/pci0000:80/0000:80:04.0/host20/target20:1:1/20:1:1:0/
lrwxrwxrwx  1 root root     0 Aug 16 17:21 device -> ../../../devices/pci0000:80/0000:80:04.0/host20/target20:0:0/20:0:0:0/
-r--r--r--  1 root root 16384 Aug 16 17:21 level
-r--r--r--  1 root root 16384 Aug 16 17:21 resync
-r--r--r--  1 root root 16384 Aug 16 17:21 state

So it's really simple: for a SCSI device representing a hardware raid,
it shows the raid level, the array state, the resync % complete (if the
state is resyncing) and the underlying components of the RAID (these are
exposed in fusion on the virtual channel 1).

As you can see, this type of information can be exported by almost
anything, including software raid.

The more difficult trick, of course, is going to be getting it to
perform configuration type actions with writable attributes.

Signed-off-by: James Bottomley <James.Bottomley@SteelEye.com>
2005-08-30 22:48:51 -05:00
James Bottomley
53c165e0a6 [SCSI] correct attribute_container list usage
One of the changes in the attribute_container code in the scsi-misc tree
was to add a lock to protect the list of devices per container.  This,
unfortunately, leads to potential scheduling while atomic problems if
there's a sleep in the function called by a trigger.

The correct solution is to use the kernel klist infrastructure instead
which allows lockless traversal of a list.

Signed-off-by: James Bottomley <James.Bottomley@SteelEye.com>
2005-08-30 22:44:20 -05:00
Linus Torvalds
6b39374a27 Merge refs/heads/upstream from master.kernel.org:/pub/scm/linux/kernel/git/jgarzik/netdev-2.6.git 2005-08-30 11:16:30 -07:00
Jeff Garzik
ed735ccbef Merge HEAD from /spare/repo/linux-2.6/.git 2005-08-30 13:32:29 -04:00
Jeff Garzik
374b187357 [libata] update several drivers to use pci_iomap()/pci_iounmap() 2005-08-30 05:42:52 -04:00
Jeff Garzik
1623c81eec [libata] allow ATAPI to be enabled with new atapi_enabled module option
ATAPI is getting close to being ready.  To increase exposure, we enable
the code in the upstream kernel, but default it to off (present
behavior).  Users must pass atapi_enabled=1 as a module option (if
module) or on the kernel command line (if built in) to turn on
discovery of their ATAPI devices.
2005-08-30 03:37:42 -04:00
Takashi Iwai
d568121ce3 [PATCH] Assign device pointer to OSS devices
Add register_sound_special_device() function to allow assignment of
device pointer to a specific OSS device for HAL.

Signed-off-by: Takashi Iwai <tiwai@suse.de>
2005-08-30 08:58:37 +02:00
Linus Torvalds
8bc2bee26b Merge HEAD from master.kernel.org:/pub/scm/linux/kernel/git/paulus/ppc64-2.6 2005-08-29 21:44:33 -07:00
Stephen Rothwell
fb120da678 [PATCH] Make MODULE_DEVICE_TABLE work for vio devices
Make MODULE_DEVICE_TABLE work for vio devices.

Signed-off-by: Stephen Rothwell <sfr@canb.auug.org.au>
Signed-off-by: Paul Mackerras <paulus@samba.org>
2005-08-30 13:31:56 +10:00
Arnaldo Carvalho de Melo
a84ffe4303 [DCCP]: Introduce DCCP_SOCKOPT_PACKET_SIZE
So that applications can set dccp_sock->dccps_pkt_size, that in turn
is used in the CCID3 half connection init routines to set
ccid3hc[tr]x_s and use it in its rate calculations.

Signed-off-by: Arnaldo Carvalho de Melo <acme@mandriva.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2005-08-29 16:13:37 -07:00
Harald Welte
0ac4f893f2 [NETFILTER6]: Add new ip6tables HOPLIMIT target
This target allows users to modify the hoplimit header field of the
IPv6 header.

Signed-off-by: Harald Welte <laforge@netfilter.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
2005-08-29 16:13:29 -07:00
Harald Welte
5f2c3b9107 [NETFILTER]: Add new iptables TTL target
This new iptables target allows manipulation of the TTL of an IPv4 packet.

Signed-off-by: Harald Welte <laforge@netfilter.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
2005-08-29 16:13:22 -07:00
Paul E. McKenney
cf4ef01440 [LIST]: Add docbook header comments for hlist_add_{before,after}_rcu()
Signed-off-by: David S. Miller <davem@davemloft.net>
2005-08-29 16:11:00 -07:00
Alexey Dobriyan
57bf1451ac [NET]: net/802: more endian annotations
The rest of endian warnings now belongs to tr.c exclusively.

Signed-off-by: Alexey Dobriyan <adobriyan@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2005-08-29 16:10:54 -07:00
Robert Olsson
e5b4376074 [IPV4]: Prepare FIB core for RCU.
* RCU versions of hlist_***_rcu
* fib_alias partial rcu port just whats needed now.

Signed-off-by: Robert Olsson <Robert.Olsson@data.slu.se>
Signed-off-by: Stephen Hemminger <shemminger@osdl.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
2005-08-29 16:08:31 -07:00
Patrick McHardy
05465343bf [NETFILTER]: Add goto target
Originally written by Henrik Nordstrom <hno@marasystems.com>, taken
from netfilter patch-o-matic and added ip6_tables support.

Signed-off-by: Patrick McHardy <kaber@trash.net>
Signed-off-by: David S. Miller <davem@davemloft.net>
2005-08-29 16:04:18 -07:00
Patrick McHardy
764d8a9f24 [NETFILTER]: Add IPv6 REJECT target
Originally written by Yasuyuki Kozakai <yasuyuki.kozakai@toshiba.co.jp>,
taken from netfilter patch-o-matic and fixed up to work with current
kernels.

Signed-off-by: Patrick McHardy <kaber@trash.net>
Signed-off-by: David S. Miller <davem@davemloft.net>
2005-08-29 16:04:12 -07:00
Pablo Neira Ayuso
7567662ba8 [NETFILTER]: Add string match
Signed-off-by: Pablo Neira Ayuso <pablo@eurodev.net>
Signed-off-by: Patrick McHardy <kaber@trash.net>
Signed-off-by: David S. Miller <davem@davemloft.net>
2005-08-29 16:04:07 -07:00
Jon Wetzel
a6f9a70578 [NET]: Add support for getting the permanent hardware address.
This patch adds a new field to net device to hold the permanent
hardware address, and adds a new generic ethtool_op function to
get that address.

Signed-off-by: Jon Wetzel <jon_wetzel@dell.com>
Signed-off-by: John W. Linville <linville@tuxdriver.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2005-08-29 16:02:44 -07:00
Ian McDonald
1bc0986957 [DCCP]: Fix the timestamp options
This changes timestamp, timestamp echo, and elapsed time to use units of 10
usecs as per DCCP spec. This has been tested to verify that times are correct.
Also fixed up length and used hton/ntoh more.

Still to add in later patches:
- actually use elapsed time to adjust RTT
(commented out as was prior to this patch)
- send options at times more closely following the spec
(content is now correct)

Signed-off-by: Ian McDonald <iam4@cs.waikato.ac.nz>
Signed-off-by: Arnaldo Carvalho de Melo <acme@mandriva.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2005-08-29 16:02:34 -07:00
David S. Miller
d179cd1292 [NET]: Implement SKB fast cloning.
Protocols that make extensive use of SKB cloning,
for example TCP, eat at least 2 allocations per
packet sent as a result.

To cut the kmalloc() count in half, we implement
a pre-allocation scheme wherein we allocate
2 sk_buff objects in advance, then use a simple
reference count to free up the memory at the
correct time.

Based upon an initial patch by Thomas Graf and
suggestions from Herbert Xu.

Signed-off-by: David S. Miller <davem@davemloft.net>
2005-08-29 16:01:54 -07:00
Arnaldo Carvalho de Melo
6ed8a48582 [NETLINK]: Fix sparse warnings
Signed-off-by: Arnaldo Carvalho de Melo <acme@mandriva.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2005-08-29 16:01:35 -07:00
Arnaldo Carvalho de Melo
20380731bc [NET]: Fix sparse warnings
Of this type, mostly:

CHECK   net/ipv6/netfilter.c
net/ipv6/netfilter.c:96:12: warning: symbol 'ipv6_netfilter_init' was not declared. Should it be static?
net/ipv6/netfilter.c:101:6: warning: symbol 'ipv6_netfilter_fini' was not declared. Should it be static?

Signed-off-by: Arnaldo Carvalho de Melo <acme@mandriva.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2005-08-29 16:01:32 -07:00
Patrick McHardy
066286071d [NETLINK]: Add "groups" argument to netlink_kernel_create
Signed-off-by: Patrick McHardy <kaber@trash.net>
Signed-off-by: David S. Miller <davem@davemloft.net>
2005-08-29 16:01:11 -07:00
Patrick McHardy
9a4595bc7e [NETLINK]: Add set/getsockopt options to support more than 32 groups
NETLINK_ADD_MEMBERSHIP/NETLINK_DROP_MEMBERSHIP are used to join/leave
groups, NETLINK_PKTINFO is used to enable nl_pktinfo control messages
for received packets to get the extended destination group number.

Signed-off-by: Patrick McHardy <kaber@trash.net>
Signed-off-by: David S. Miller <davem@davemloft.net>
2005-08-29 16:01:07 -07:00
Patrick McHardy
ac6d439d20 [NETLINK]: Convert netlink users to use group numbers instead of bitmasks
Signed-off-by: Patrick McHardy <kaber@trash.net>
Signed-off-by: David S. Miller <davem@davemloft.net>
2005-08-29 16:00:54 -07:00
Patrick McHardy
d629b836d1 [NETLINK]: Use group numbers instead of bitmasks internally
Using the group number allows increasing the number of groups without
beeing limited by the size of the bitmask. It introduces one limitation
for netlink users: messages can't be broadcasted to multiple groups anymore,
however this feature was never used inside the kernel.

Signed-off-by: Patrick McHardy <kaber@trash.net>
Signed-off-by: David S. Miller <davem@davemloft.net>
2005-08-29 16:00:49 -07:00
Patrick McHardy
db08052979 [NETLINK]: Remove unused groups member from struct netlink_skb_parms
Signed-off-by: Patrick McHardy <kaber@trash.net>
Signed-off-by: David S. Miller <davem@davemloft.net>
2005-08-29 16:00:39 -07:00
Domen Puncer
fb13ab2849 [NETFILTER]: Remove two unused files
Signed-off-by: Domen Puncer <domen@coderock.org>
Signed-off-by: Patrick McHardy <kaber@trash.net>
Signed-off-by: David S. Miller <davem@davemloft.net>
2005-08-29 15:58:29 -07:00
Patrick McHardy
a61bbcf28a [NET]: Store skb->timestamp as offset to a base timestamp
Reduces skb size by 8 bytes on 64-bit.

Signed-off-by: Patrick McHardy <kaber@trash.net>
Signed-off-by: David S. Miller <davem@davemloft.net>
2005-08-29 15:58:24 -07:00
Patrick McHardy
25ed891019 [NETFILTER]: Nicer names for ipt_connbytes constants
Signed-off-by: Patrick McHardy <kaber@trash.net>
Signed-off-by: David S. Miller <davem@davemloft.net>
2005-08-29 15:58:17 -07:00
Harald Welte
9d810fd2d2 [NETFILTER]: Add new iptables "connbytes" match
This patch ads a new "connbytes" match that utilizes the CONFIG_NF_CT_ACCT
per-connection byte and packet counters.  Using it you can do things like
packet classification on average packet size within a connection.

Signed-off-by: Harald Welte <laforge@netfilter.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
2005-08-29 15:58:04 -07:00
Harald Welte
0ba2c6e8c0 [NETFILTER]: introduce and use aligned_u64 data type
As proposed by Andi Kleen, this is required esp. for x86_64 architecture,
where 64bit code needs 8byte aligned 64bit data types, but 32bit userspace
apps will only align to 4bytes.

Signed-off-by: Harald Welte <laforge@netfilter.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
2005-08-29 15:57:59 -07:00
Arnaldo Carvalho de Melo
a8c2190ee7 [INET_DIAG]: Rename tcp_diag.[ch] to inet_diag.[ch]
Next changeset will introduce net/ipv4/tcp_diag.c, moving the code that was put
transitioanlly in inet_diag.c.

Signed-off-by: Arnaldo Carvalho de Melo <acme@mandriva.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2005-08-29 15:57:48 -07:00
Arnaldo Carvalho de Melo
73c1f4a033 [TCPDIAG]: Just rename everything to inet_diag
Next changeset will rename tcp_diag.[ch] to inet_diag.[ch].

I'm taking this longer route so as to easy review, making clear the changes
made all along the way.

Signed-off-by: Arnaldo Carvalho de Melo <acme@mandriva.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2005-08-29 15:57:44 -07:00
Arnaldo Carvalho de Melo
4f5736c4c7 [TCPDIAG]: Introduce inet_diag_{register,unregister}
Next changeset will rename tcp_diag to inet_diag and move the tcp_diag code out
of it and into a new tcp_diag.c, similar to the net/dccp/diag.c introduced in
this changeset, completing the transition to a generic inet_diag
infrastructure.

Signed-off-by: Arnaldo Carvalho de Melo <acme@mandriva.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2005-08-29 15:57:38 -07:00
Arnaldo Carvalho de Melo
505cbfc577 [IPV6]: Generalise the tcp_v6_lookup routines
In the same way as was done with the v4 counterparts, this will be moved
to inet6_hashtables.c.

Signed-off-by: Arnaldo Carvalho de Melo <acme@mandriva.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2005-08-29 15:57:24 -07:00
Harald Welte
b766b305d3 [NETFILTER]: Fix gcc-3.4.x warning about iplicit operator precedence
Fix gcc-3.4.x warning about iplicit operator precedence in NF_QUEUE_NR()

Signed-off-by: Harald Welte <laforge@netfilter.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
2005-08-29 15:57:14 -07:00
Denis Vlasenko
0a242efc4f [NET]: Deinline netif_carrier_{on,off}().
# grep -r 'netif_carrier_o[nf]' linux-2.6.12 | wc -l
246

# size vmlinux.org vmlinux.carrier
text    data     bss     dec     hex filename
4339634 1054414  259296 5653344  564360 vmlinux.org
4337710 1054414  259296 5651420  563bdc vmlinux.carrier

And this ain't an allyesconfig kernel!

Signed-off-by: David S. Miller <davem@davemloft.net>
2005-08-29 15:57:08 -07:00
Harald Welte
5917ed961d [NETFILTER]: Fix NF_QUEUE_NR() macro
I obviously wanted to use bitwise-or, not logical or.

Signed-off-by: Harald Welte <laforge@netfilter.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
2005-08-29 15:57:03 -07:00
Arnaldo Carvalho de Melo
8c60f3fab5 [CCID3]: Separate most of the packet history code
This also changes the list_for_each_entry_safe_continue behaviour to match its
kerneldoc comment, that is, to start after the pos passed.

Also adds several helper functions from previously open coded fragments, making
the code more clear.

Signed-off-by: Arnaldo Carvalho de Melo <acme@mandriva.com>
2005-08-29 15:56:28 -07:00
Arnaldo Carvalho de Melo
540722ffc3 [TCPDIAG]: Implement cheapest way of supporting DCCPDIAG_GETSOCK
With ugly ifdefs, etc, but this actually:

1. keeps the existing ABI, i.e. no need to recompile the iproute2
   utilities if not interested in DCCP.

2. Provides all the tcp_diag functionality in DCCP, with just a
   small patch that makes iproute2 support DCCP.

Of course I'll get this cleaned-up in time, but for now I think its
OK to be this way to quickly get this functionality.

iproute2-ss050808 patch at:

http://vger.kernel.org/~acme/iproute2-ss050808.dccp.patch

Signed-off-by: Arnaldo Carvalho de Melo <acme@mandriva.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2005-08-29 15:56:23 -07:00
Arnaldo Carvalho de Melo
6687e988d9 [ICSK]: Move TCP congestion avoidance members to icsk
This changeset basically moves tcp_sk()->{ca_ops,ca_state,etc} to inet_csk(),
minimal renaming/moving done in this changeset to ease review.

Most of it is just changes of struct tcp_sock * to struct sock * parameters.

With this we move to a state closer to two interesting goals:

1. Generalisation of net/ipv4/tcp_diag.c, becoming inet_diag.c, being used
   for any INET transport protocol that has struct inet_hashinfo and are
   derived from struct inet_connection_sock. Keeps the userspace API, that will
   just not display DCCP sockets, while newer versions of tools can support
   DCCP.

2. INET generic transport pluggable Congestion Avoidance infrastructure, using
   the current TCP CA infrastructure with DCCP.

Signed-off-by: Arnaldo Carvalho de Melo <acme@mandriva.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2005-08-29 15:56:18 -07:00
Arnaldo Carvalho de Melo
64cf1e5d8b [DCCP]: Finish the TIMEWAIT minisock support
Using most of the infrastructure TCP uses, with a dccp_death_row,
etc. As per my current interpretation of the draft what we have with
this changeset seems to be all we need (or very close to it 8)).

Signed-off-by: Arnaldo Carvalho de Melo <acme@mandriva.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2005-08-29 15:56:03 -07:00
Harald Welte
1d3de414eb [NETFILTER]: New iptables DCCP protocol header match
Using this new iptables DCCP protocol header match, it is possible to
create simplistic stateless packet filtering rules for DCCP.  It
permits matching of port numbers, packet type and options.

Signed-off-by: Harald Welte <laforge@netfilter.org>
Signed-off-by: Arnaldo Carvalho de Melo <acme@ghostprotocols.net>
Signed-off-by: David S. Miller <davem@davemloft.net>
2005-08-29 15:54:28 -07:00
Arnaldo Carvalho de Melo
e2e268665f [DCCP]: Fix struct sockaddr_dccp definition
Signed-off-by: Arnaldo Carvalho de Melo <acme@ghostprotocols.net>
Signed-off-by: David S. Miller <davem@davemloft.net>
2005-08-29 15:54:23 -07:00
Harald Welte
5a47a470e6 [DCCP]: make <linux/dccp.h> include-able from userspace
The protocol header files in <linux/foo.h> are usually structured in a
way to be included by userspace code.  The top section consists of
general protocol structure definitions, typedefs, enums - followed by
an #ifdef __KERNEL__ section.

Currently <linux/dccp.h> doesn't follow that convention and can
therefore not be used from userspace.  However, for example iptables'
libipt_dccp.c actually needs various definitions from there.

Signed-off-by: Harald Welte <laforge@netfilter.org>
Signed-off-by: Arnaldo Carvalho de Melo <acme@ghostprotocols.net>
Signed-off-by: David S. Miller <davem@davemloft.net>
2005-08-29 15:54:18 -07:00
Harald Welte
8a61fadb39 [NETFILTER]: check nf_log function call arguments
Check whether pf is too large in order to prevent array overflow.

Signed-off-by: Harald Welte <laforge@netfilter.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
2005-08-29 15:51:25 -07:00
Harald Welte
bbd86b9fc4 [NETFILTER]: add /proc/net/netfilter interface to nf_queue
This patch adds a /proc/net/netfilter/nf_queue file, similar to the
recently-added /proc/net/netfilter/nf_log.  It indicates which queue
handler is registered to which protocol family.  This is useful since
there are now multiple queue handlers in the treee (ip[6]_queue,
nfnetlink_queue).

Signed-off-by: Harald Welte <laforge@netfilter.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
2005-08-29 15:51:18 -07:00
Harald Welte
fbcd923c3e [NETFILTER]: add correct bridging support to nfnetlink_{queue,log}
This patch adds support for passing the real 'physical' device ifindex
down to userspace via nfnetlink_log and nfnetlink_queue.

This feature basically obsoletes net/bridge/netfilter/ebt_ulog.c, and
it is likely ebt_ulog.c will die with one of the next couple of
patches.

Signed-off-by: Harald Welte <laforge@netfilter.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
2005-08-29 15:51:15 -07:00
Arnaldo Carvalho de Melo
74459dc7ba [LIST]: Introduce list_for_each_entry_safe_continue
Used in the dccp CCID3 code, that is going to be submitted RSN.

Signed-off-by: Arnaldo Carvalho de Melo <acme@ghostprotocols.net>
Signed-off-by: David S. Miller <davem@davemloft.net>
2005-08-29 15:50:04 -07:00
Arnaldo Carvalho de Melo
7c657876b6 [DCCP]: Initial implementation
Development to this point was done on a subversion repository at:

http://oops.ghostprotocols.net:81/cgi-bin/viewcvs.cgi/dccp-2.6/

This repository will be kept at this site for the foreseable future,
so that interested parties can see the history of this code,
attributions, etc.

If I ever decide to take this offline I'll provide the full history at
some other suitable place.

Signed-off-by: Arnaldo Carvalho de Melo <acme@ghostprotocols.net>
Signed-off-by: David S. Miller <davem@davemloft.net>
2005-08-29 15:49:46 -07:00
Arnaldo Carvalho de Melo
c4365c9235 [RANDOM]: Introduce secure_dccp_sequence_number
Code contributed by Stephen Hemminger.

Signed-off-by: Arnaldo Carvalho de Melo <acme@ghostprotocols.net>
Signed-off-by: David S. Miller <davem@davemloft.net>
2005-08-29 15:49:40 -07:00
Arnaldo Carvalho de Melo
295f7324ff [ICSK]: Introduce reqsk_queue_prune from code in tcp_synack_timer
With this we're very close to getting all of the current TCP
refactorings in my dccp-2.6 tree merged, next changeset will export
some functions needed by the current DCCP code and then dccp-2.6.git
will be born!

Signed-off-by: Arnaldo Carvalho de Melo <acme@ghostprotocols.net>
Signed-off-by: David S. Miller <davem@davemloft.net>
2005-08-29 15:49:29 -07:00
Arnaldo Carvalho de Melo
3f421baa47 [NET]: Just move the inet_connection_sock function from tcp sources
Completing the previous changeset, this also generalises tcp_v4_synq_add,
renaming it to inet_csk_reqsk_queue_hash_add, already geing used in the
DCCP tree, which I plan to merge RSN.

Signed-off-by: Arnaldo Carvalho de Melo <acme@ghostprotocols.net>
Signed-off-by: David S. Miller <davem@davemloft.net>
2005-08-29 15:49:14 -07:00
Arnaldo Carvalho de Melo
463c84b97f [NET]: Introduce inet_connection_sock
This creates struct inet_connection_sock, moving members out of struct
tcp_sock that are shareable with other INET connection oriented
protocols, such as DCCP, that in my private tree already uses most of
these members.

The functions that operate on these members were renamed, using a
inet_csk_ prefix while not being moved yet to a new file, so as to
ease the review of these changes.

Signed-off-by: Arnaldo Carvalho de Melo <acme@ghostprotocols.net>
Signed-off-by: David S. Miller <davem@davemloft.net>
2005-08-29 15:43:19 -07:00
Arnaldo Carvalho de Melo
8feaf0c0a5 [INET]: Generalise tcp_tw_bucket, aka TIME_WAIT sockets
This paves the way to generalise the rest of the sock ID lookup
routines and saves some bytes in TCPv4 TIME_WAIT sockets on distro
kernels (where IPv6 is always built as a module):

[root@qemu ~]# grep tw_sock /proc/slabinfo
tw_sock_TCPv6  0  0  128  31  1
tw_sock_TCP    0  0   96  41  1
[root@qemu ~]#

Now if a protocol wants to use the TIME_WAIT generic infrastructure it
only has to set the sk_prot->twsk_obj_size field with the size of its
inet_timewait_sock derived sock and proto_register will create
sk_prot->twsk_slab, for now its only for INET sockets, but we can
introduce timewait_sock later if some non INET transport protocolo
wants to use this stuff.

Next changesets will take advantage of this new infrastructure to
generalise even more TCP code.

[acme@toy net-2.6.14]$ grep built-in /tmp/before.size /tmp/after.size
/tmp/before.size: 188646   11764    5068  205478   322a6 net/ipv4/built-in.o
/tmp/after.size:  188144   11764    5068  204976   320b0 net/ipv4/built-in.o
[acme@toy net-2.6.14]$

Tested with both IPv4 & IPv6 (::1 (localhost) & ::ffff:172.20.0.1
(qemu host)).

Signed-off-by: Arnaldo Carvalho de Melo <acme@ghostprotocols.net>
Signed-off-by: David S. Miller <davem@davemloft.net>
2005-08-29 15:42:13 -07:00
Arnaldo Carvalho de Melo
c752f0739f [TCP]: Move the tcp sock states to net/tcp_states.h
Lots of places just needs the states, not even linux/tcp.h, where this
enum was, needs it.

This speeds up development of the refactorings as less sources are
rebuilt when things get moved from net/tcp.h.

Signed-off-by: Arnaldo Carvalho de Melo <acme@ghostprotocols.net>
Signed-off-by: David S. Miller <davem@davemloft.net>
2005-08-29 15:41:54 -07:00
Harald Welte
1444fc559b [NETFILTER]: don't use nested attributes for conntrack_expect
We used to use nested nfattr structures for ip_conntrack_expect.  This is
bogus, since ip_conntrack and ip_conntrack_expect are communicated in
different netlink message types.  both should be encoded at the top level
attributes, no extra nesting required.  This patch addresses the issue.

Signed-off-by: Harald Welte <laforge@netfilter.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
2005-08-29 15:40:09 -07:00
Harald Welte
927ccbcc28 [NETFILTER]: attribute count is an attribute of message type, not subsytem
Prior to this patch, every nfnetlink subsystem had to specify it's
attribute count.  However, in reality the attribute count depends on
the message type within the subsystem, not the subsystem itself.  This
patch moves 'attr_count' from 'struct nfnetlink_subsys' into
nfnl_callback to fix this.

Signed-off-by: Harald Welte <laforge@netfilter.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
2005-08-29 15:39:14 -07:00
Patrick McHardy
a86888b925 [NETFILTER]: Fix multiple problems with the conntrack event cache
refcnt underflow: the reference count is decremented when a conntrack
entry is removed from the hash but it is not incremented when entering
new entries.

missing protection of process context against softirq context: all
cache operations need to locally disable softirqs to avoid races.
Additionally the event cache can't be initialized when a packet
enteres the conntrack code but needs to be initialized whenever we
cache an event and the stored conntrack entry doesn't match the
current one.

incorrect flushing of the event cache in ip_ct_iterate_cleanup:
without real locking we can't flush the cache for different CPUs
without incurring races. The cache for different CPUs can only be
flushed when no packets are going through the
code. ip_ct_iterate_cleanup doesn't need to drop all references, so
flushing is moved to the cleanup path.

Signed-off-by: Patrick McHardy <kaber@trash.net>
Signed-off-by: David S. Miller <davem@davemloft.net>
2005-08-29 15:38:54 -07:00
Arnaldo Carvalho de Melo
a55ebcc4c4 [INET]: Move bind_hash from tcp_sk to inet_sk
This should really be in a inet_connection_sock, but I'm leaving it
for a later optimization, when some more fields common to INET
transport protocols now in tcp_sk or inet_sk will be chunked out into
inet_connection_sock, for now its better to concentrate on getting the
changes in the core merged to leave the DCCP tree with only DCCP
specific code.

Next changesets will take advantage of this move to generalise things
like tcp_bind_hash, tcp_put_port, tcp_inherit_port, making the later
receive a inet_hashinfo parameter, and even __tcp_tw_hashdance, etc in
the future, when tcp_tw_bucket gets transformed into the struct
timewait_sock hierarchy.

tcp_destroy_sock also is eligible as soon as tcp_orphan_count gets
moved to sk_prot.

A cascade of incremental changes will ultimately make the tcp_lookup
functions be fully generic.

Signed-off-by: Arnaldo Carvalho de Melo <acme@ghostprotocols.net>
Signed-off-by: David S. Miller <davem@davemloft.net>
2005-08-29 15:38:48 -07:00
Arnaldo Carvalho de Melo
0f7ff9274e [INET]: Just rename the TCP hashtable functions/structs to inet_
This is to break down the complexity of the series of patches,
making it very clear that this one just does:

1. renames tcp_ prefixed hashtable functions and data structures that
   were already mostly generic to inet_ to share it with DCCP and
   other INET transport protocols.

2. Removes not used functions (__tb_head & tb_head)

3. Removes some leftover prototypes in the headers (tcp_bucket_unlock &
   tcp_v4_build_header)

Next changesets will move tcp_sk(sk)->bind_hash to inet_sock so that we can
make functions such as tcp_inherit_port, __tcp_inherit_port, tcp_v4_get_port,
__tcp_put_port,  generic and get others like tcp_destroy_sock closer to generic
(tcp_orphan_count will go to sk->sk_prot to allow this).

Eventually most of these functions will be used passing the transport protocol
inet_hashinfo structure.

Signed-off-by: Arnaldo Carvalho de Melo <acme@ghostprotocols.net>
Signed-off-by: David S. Miller <davem@davemloft.net>
2005-08-29 15:38:32 -07:00
Harald Welte
0597f2680d [NETFILTER]: Add new "nfnetlink_log" userspace packet logging facility
This is a generic (layer3 independent) version of what ipt_ULOG is already
doing for IPv4 today.  ipt_ULOG, ebt_ulog and finally also ip[6]t_LOG will
be deprecated by this mechanism in the long term.

Signed-off-by: Harald Welte <laforge@netfilter.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
2005-08-29 15:38:12 -07:00
Harald Welte
608c8e4f7b [NETFILTER]: Extend netfilter logging API
This patch is in preparation to nfnetlink_log:
- loggers now have to register struct nf_logger instead of nf_logfn
- nf_log_unregister() replaced by nf_log_unregister_pf() and
  nf_log_unregister_logger()
- add comment to ip[6]t_LOG.h to assure nobody redefines flags
- add /proc/net/netfilter/nf_log to tell user which logger is currently
  registered for which address family
- if user has configured logging, but no logging backend (logger) is
  available, always spit a message to syslog, not just the first time.
- split ip[6]t_LOG.c into two parts:
  Backend: Always try to register as logger for the respective address family
  Frontend: Always log via nf_log_packet() API
- modify all users of nf_log_packet() to accomodate additional argument

Signed-off-by: Harald Welte <laforge@netfilter.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
2005-08-29 15:38:07 -07:00
Harald Welte
838ab63649 [NETFILTER]: Add refcounting and /proc/net/netfilter interface to nfnetlink_queue
Signed-off-by: Harald Welte <laforge@netfilter.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
2005-08-29 15:38:01 -07:00
Arnaldo Carvalho de Melo
32519f11d3 [INET]: Introduce inet_sk_rebuild_header
From tcp_v4_rebuild_header, that already was pretty generic, I only
needed to use sk->sk_protocol instead of the hardcoded IPPROTO_TCP and
establish the requirement that INET transport layer protocols that
want to use this function map TCP_SYN_SENT to its equivalent state.

Signed-off-by: Arnaldo Carvalho de Melo <acme@ghostprotocols.net>
Signed-off-by: David S. Miller <davem@davemloft.net>
2005-08-29 15:37:55 -07:00
Harald Welte
7af4cc3fa1 [NETFILTER]: Add "nfnetlink_queue" netfilter queue handler over nfnetlink
- Add new nfnetlink_queue module
- Add new ipt_NFQUEUE and ip6t_NFQUEUE modules to access queue numbers 1-65535
- Mark ip_queue and ip6_queue Kconfig options as OBSOLETE
- Update feature-removal-schedule to remove ip[6]_queue in December

Signed-off-by: Harald Welte <laforge@netfilter.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
2005-08-29 15:36:56 -07:00
Harald Welte
0ab43f8499 [NETFILTER]: Core changes required by upcoming nfnetlink_queue code
- split netfiler verdict in 16bit verdict and 16bit queue number
- add 'queuenum' argument to nf_queue_outfn_t and its users ip[6]_queue
- move NFNL_SUBSYS_ definitions from enum to #define
- introduce autoloading for nfnetlink subsystem modules
- add MODULE_ALIAS_NFNL_SUBSYS macro
- add nf_unregister_queue_handlers() to register all handlers for a given
  nf_queue_outfn_t
- add more verbose DEBUGP macro definition to nfnetlink.c
- make nfnetlink_subsys_register fail if subsys already exists
- add some more comments and debug statements to nfnetlink.c

Signed-off-by: Harald Welte <laforge@netfilter.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
2005-08-29 15:36:49 -07:00
Harald Welte
2cc7d57309 [NETFILTER]: Move reroute-after-queue code up to the nf_queue layer.
The rerouting functionality is required by the core, therefore it has
to be implemented by the core and not in individual queue handlers.

Signed-off-by: Harald Welte <laforge@netfilter.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
2005-08-29 15:36:19 -07:00
Harald Welte
4fdb3bb723 [NETLINK]: Add properly module refcounting for kernel netlink sockets.
- Remove bogus code for compiling netlink as module
- Add module refcounting support for modules implementing a netlink
  protocol
- Add support for autoloading modules that implement a netlink protocol
  as soon as someone opens a socket for that protocol

Signed-off-by: Harald Welte <laforge@netfilter.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
2005-08-29 15:35:08 -07:00
Harald Welte
089af26c70 [NETFILTER]: Rename skb_ip_make_writable() to skb_make_writable()
There is nothing IPv4-specific in it.  In fact, it was already used by
IPv6, too...  Upcoming nfnetlink_queue code will use it for any kind
of packet.

Signed-off-by: Harald Welte <laforge@netfilter.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
2005-08-29 15:34:40 -07:00
David S. Miller
f2ccd8fa06 [NET]: Kill skb->real_dev
Bonding just wants the device before the skb_bond()
decapsulation occurs, so simply pass that original
device into packet_type->func() as an argument.

It remains to be seen whether we can use this same
exact thing to get rid of skb->input_dev as well.

Signed-off-by: David S. Miller <davem@davemloft.net>
2005-08-29 15:32:25 -07:00
Patrick McHardy
b6b99eb540 [NET]: Reduce tc_index/tc_verd to u16
Signed-off-by: Patrick McHardy <kaber@trash.net>
Signed-off-by: David S. Miller <davem@davemloft.net>
2005-08-29 15:32:20 -07:00
Harald Welte
080774a243 [NETFILTER]: Add ctnetlink subsystem
Add ctnetlink subsystem for userspace-access to ip_conntrack table.
This allows reading and updating of existing entries, as well as
creating new ones (and new expect's) via nfnetlink.

Please note the 'strange' byte order: nfattr (tag+length) are in host
byte order, while the payload is always guaranteed to be in network
byte order.  This allows a simple userspace process to encapsulate netlink
messages into arch-independent udp packets by just processing/swapping the
headers and not knowing anything about the actual payload.

Signed-off-by: Harald Welte <laforge@netfilter.org>
Signed-off-by: Patrick McHardy <kaber@trash.net>
Signed-off-by: David S. Miller <davem@davemloft.net>
2005-08-29 15:31:49 -07:00
Stephen Hemminger
6f1cf16582 [NET]: Remove HIPPI private from skbuff.h
This removes the private element from skbuff, that is only used by
HIPPI. Instead it uses skb->cb[] to hold the additional data that is
needed in the output path from hard_header to device driver.

PS: The only qdisc that might potentially corrupt this cb[] is if
netem was used over HIPPI. I will take care of that by fixing netem
to use skb->stamp. I don't expect many users of netem over HIPPI

Signed-off-by: Stephen Hemminger <shemminger@osdl.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
2005-08-29 15:31:42 -07:00
Harald Welte
f9e815b376 [NETFITLER]: Add nfnetlink layer.
Introduce "nfnetlink" (netfilter netlink) layer.  This layer is used as
transport layer for all userspace communication of the new upcoming
netfilter subsystems, such as ctnetlink, nfnetlink_queue and some day even
the mythical pkttables ;)

Signed-off-by: Harald Welte <laforge@netfilter.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
2005-08-29 15:31:29 -07:00
Harald Welte
ac3247baf8 [NETFILTER]: connection tracking event notifiers
This adds a notifier chain based event mechanism for ip_conntrack state
changes.  As opposed to the previous implementations in patch-o-matic, we
do no longer need a field in the skb to achieve this.

Thanks to the valuable input from Patrick McHardy and Rusty on the idea
of a per_cpu implementation.

Signed-off-by: Harald Welte <laforge@netfilter.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
2005-08-29 15:31:24 -07:00
Patrick McHardy
abc3bc5804 [NET]: Kill skb->tc_classid
Signed-off-by: Patrick McHardy <kaber@trash.net>
Signed-off-by: David S. Miller <davem@davemloft.net>
2005-08-29 15:31:18 -07:00
David S. Miller
8728b834b2 [NET]: Kill skb->list
Remove the "list" member of struct sk_buff, as it is entirely
redundant.  All SKB list removal callers know which list the
SKB is on, so storing this in sk_buff does nothing other than
taking up some space.

Two tricky bits were SCTP, which I took care of, and two ATM
drivers which Francois Romieu <romieu@fr.zoreil.com> fixed
up.

Signed-off-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Francois Romieu <romieu@fr.zoreil.com>
2005-08-29 15:31:14 -07:00
Harald Welte
6869c4d8e0 [NETFILTER]: reduce netfilter sk_buff enlargement
As discussed at netconf'05, we're trying to save every bit in sk_buff.
The patch below makes sk_buff 8 bytes smaller.  I did some basic
testing on my notebook and it seems to work.

The only real in-tree user of nfcache was IPVS, who only needs a
single bit.  Unfortunately I couldn't find some other free bit in
sk_buff to stuff that bit into, so I introduced a separate field for
them.  Maybe the IPVS guys can resolve that to further save space.

Initially I wanted to shrink pkt_type to three bits (PACKET_HOST and
alike are only 6 values defined), but unfortunately the bluetooth code
overloads pkt_type :(

The conntrack-event-api (out-of-tree) uses nfcache, but Rusty just
came up with a way how to do it without any skb fields, so it's safe
to remove it.

- remove all never-implemented 'nfcache' code
- don't have ipvs code abuse 'nfcache' field. currently get's their own
  compile-conditional skb->ipvs_property field.  IPVS maintainers can
  decide to move this bit elswhere, but nfcache needs to die.
- remove skb->nfcache field to save 4 bytes
- move skb->nfctinfo into three unused bits to save further 4 bytes

Signed-off-by: Harald Welte <laforge@netfilter.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
2005-08-29 15:31:04 -07:00
Harald Welte
bf3a46aa9b [NETFILTER]: convert nfmark and conntrack mark to 32bit
As discussed at netconf'05, we convert nfmark and conntrack-mark to be
32bits even on 64bit architectures.

Signed-off-by: Harald Welte <laforge@netfilter.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
2005-08-29 15:29:31 -07:00
Len Brown
27a639a92d Auto-update from upstream 2005-08-29 17:02:17 -04:00
Jeff Garzik
c1b054d03f Merge /spare/repo/linux-2.6/ 2005-08-29 16:40:27 -04:00
Jeff Garzik
70d374ea99 Merge /spare/repo/linux-2.6/ 2005-08-29 15:59:42 -04:00
Al Viro
9e2d3cd34a [PATCH] mod_devicetable.h fixes
* ieee1394_device_id has kernel_ulong_t field after an odd number of
   __u32 ones.  Since mod_devicetable.h is included both from kernel and
   from host build helper, we may be in trouble if we are building on
   32bit host for 64bit target - userland sees unsigned long long,
   kernel sees unsigned long and while their sizes match, alignments
   might not.  Fixed by forcing alignment.  Fortunately, almost nobody
   else needs that - the rest of such fields is naturally aligned as it
   is.

 * of_device_id has void * in it.  Host userland helpers need
   kernel_ulong_t instead, since their void * might have nothing to do
   with the kernel one.  Fixed in the same way it's done for similar
   problems in pcmcia_device_id (ifdef __KERNEL__).

 * pcmcia_device_id has the same problem as ieee1394_device_id.  Fixed
   the same way.

Signed-off-by: Al Viro <viro@parcelfarce.linux.theplanet.co.uk>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2005-08-29 10:42:39 -07:00
Linus Torvalds
9ab7486e44 Merge HEAD from master.kernel.org:/home/rmk/linux-2.6-mmc.git 2005-08-29 10:35:21 -07:00
Linus Torvalds
975f957dc4 Merge HEAD from master.kernel.org:/home/rmk/linux-2.6-serial.git 2005-08-29 10:34:59 -07:00
Linus Torvalds
3d963f5bb1 Merge refs/heads/upstream from master.kernel.org:/pub/scm/linux/kernel/git/jgarzik/netdev-2.6 2005-08-29 10:04:37 -07:00
Andy Fleming
e13934563d [PATCH] PHY Layer fixup
This patch adds back the code that was taken out, thus re-enabling:

* The PHY Layer to initialize without crashing
* Drivers to actually connect to PHYs
* The entire PHY Control Layer

This patch is used by the gianfar driver, and other drivers which are in
development.

Signed-off-by: Andy Fleming <afleming@freescale.com>
Signed-off-by: Jeff Garzik <jgarzik@pobox.com>
2005-08-28 20:28:25 -04:00
Jeff Garzik
af36d7f0df [libata] license change, other bits
- changes license of all code from OSL+GPL to plain ole GPL
  - except for NVIDIA, who hasn't yet responded about sata_nv
  - copyright holders were already contacted privately

- adds info in each driver about where hardware/protocol docs may be
  obtained

- where I have made major contributions, updated copyright dates
2005-08-28 20:18:39 -04:00
James Bottomley
7a93aef7fb Merge HEAD from ../scsi-misc-2.6-tmp 2005-08-28 11:18:35 -05:00
James Bottomley
caca177987 [SCSI] add missing attribute container function prototype
attribute_container_classdev_to_container is an exported function of the
attribute_container.c file.  However, there's no prototype for it.  Now
I actually want to use it, so add one.

Signed-off-by: James Bottomley <James.Bottomley@SteelEye.com>
2005-08-28 11:14:09 -05:00
James Bottomley
31151ba2ce fix mismerge in ll_rw_blk.c 2005-08-28 10:43:07 -05:00
David Woodhouse
efda945204 Merge with master.kernel.org:/pub/scm/linux/kernel/git/torvalds/linux-2.6.git 2005-08-27 14:30:07 +02:00
Jeff Garzik
d18d36b4ed libata: fix a few alan-isms 2005-08-27 04:13:52 -04:00
Alan Cox
b73fc89f6d [PATCH] libata: regularize dma_start/stop arguments
Needed for a few PATA drivers.

Also fix up a wrong comment.
Signed-off-by: Jeff Garzik <jgarzik@pobox.com>
2005-08-26 17:36:26 -04:00
Len Brown
6153df7b2f [ACPI] delete CONFIG_ACPI_PCI
Delete the ability to build an ACPI kernel that does
not include PCI support.  When such a machine is created
and it requires a tuned kernel, send a patch.

http://bugzilla.kernel.org/show_bug.cgi?id=1364

Signed-off-by: Len Brown <len.brown@intel.com>
2005-08-25 12:40:44 -04:00
Len Brown
8466361ad5 [ACPI] delete CONFIG_ACPI_INTERPRETER
it is a synonym for CONFIG_ACPI

Signed-off-by: Len Brown <len.brown@intel.com>
2005-08-24 12:10:43 -04:00
Len Brown
888ba6c62b [ACPI] delete CONFIG_ACPI_BOOT
it has been a synonym for CONFIG_ACPI since 2.6.12

Signed-off-by: Len Brown <len.brown@intel.com>
2005-08-24 12:08:54 -04:00
Jeff Garzik
75a95178da Merge upstream into 'upstream' branch of netdev-2.6.git.
Hand fix merge conflict in drivers/net/tokenring/Kconfig.
2005-08-24 01:03:34 -04:00
Jeff Garzik
b2382b363d Merge upstream into ieee80211.
Hand-fix merge conflict in drivers/usb/net/zd1201.c.
2005-08-24 01:02:04 -04:00
Len Brown
84ffa74752 Merge from-linus to-akpm 2005-08-23 22:12:23 -04:00
Jeff Garzik
1410b0a7ad Merge /spare/repo/linux-2.6/ 2005-08-23 01:07:10 -04:00
Tejun Heo
c138950371 [PATCH] fix atapi_packet_task vs. intr race (take 2)
Interrupts from devices sharing the same IRQ could cause
ata_host_intr to finish commands being processed by atapi_packet_task
if the commands are using ATA_PROT_ATAPI_NODATA or ATA_PROT_ATAPI_DMA
protocol.  This is because libata interrupt handler is unaware that
interrupts are not expected during that period.  This patch adds
ATA_FLAG_NOINTR flag to tell the interrupt handler that we're not
expecting interrupts.

 Note that once proper HSM is implemented for interrupt-driven PIO,
this should be merged into it and this flag will be removed.

 ahci.c is a different kind of beast, so it's left alone.

* The following drivers use ata_qc_issue_prot and ata_interrupt, so
  changes in libata core will do.

  ata_piix sata_sil sata_svw sata_via sata_sis sata_uli

* The following drivers use ata_qc_issue_prot and custom intr handler.
  They need this change to work correctly.

  sata_nv sata_vsc

* The following drivers use custom issue function and intr handler.
  Currently all custom issue functions don't support ATAPI, so this
  change is irrelevant, updated for consistency and to avoid later
  mistakes.

  sata_promise sata_qstor sata_sx4

Signed-off-by: Tejun Heo <htejun@gmail.com>
Signed-off-by: Jeff Garzik <jgarzik@pobox.com>
2005-08-23 01:05:55 -04:00
Linus Torvalds
cc314eef01 Fix nasty ncpfs symlink handling bug.
This bug could cause oopses and page state corruption, because ncpfs
used the generic page-cache symlink handlign functions.  But those
functions only work if the page cache is guaranteed to be "stable", ie a
page that was installed when the symlink walk was started has to still
be installed in the page cache at the end of the walk.

We could have fixed ncpfs to not use the generic helper routines, but it
is in many ways much cleaner to instead improve on the symlink walking
helper routines so that they don't require that absolute stability.

We do this by allowing "follow_link()" to return a error-pointer as a
cookie, which is fed back to the cleanup "put_link()" routine.  This
also simplifies NFS symlink handling.

Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2005-08-19 18:02:56 -07:00
Russell King
dce7737718 [MMC] Use an IDR for host name indicies
Signed-off-by: Russell King <rmk+kernel@arm.linux.org.uk>
2005-08-19 09:42:52 +01:00
Russell King
1ad434d7cf [MMC] Use class device name for mmc host name
There's no point in having the host name duplicated between
the mmc_host structure and the encapsulated class device
structure.

Signed-off-by: Russell King <rmk+kernel@arm.linux.org.uk>
2005-08-19 09:42:21 +01:00
Russell King
00b137cfda [MMC] Add MMC class devices
Create a mmc_host class to allow enumeration of MMC host controllers
even though they have no card(s) inserted.

Patch based on work by Pierre Ossman.

Signed-off-by: Russell King <rmk+kernel@arm.linux.org.uk>
2005-08-19 09:41:24 +01:00
Russell King
d366b64363 [MMC] Add mmc_hostname() macro
mmc_hostname() returns a pointer to the hostname for the mmc_host.

Signed-off-by: Russell King <rmk+kernel@arm.linux.org.uk>
2005-08-19 09:40:08 +01:00
Jeff Garzik
a3bc068022 Merge /spare/repo/linux-2.6/ 2005-08-18 22:14:39 -04:00
Linus Torvalds
91aa9fb573 Merge master.kernel.org:/pub/scm/linux/kernel/git/bart/ide-2.6 2005-08-18 15:16:12 -07:00
Narendra Sankar
84f57fbc72 [PATCH] serverworks: add support for new southbridge IDE
BCM5785 (HT1000) is a Opteron Southbridge from Serverworks/Broadcom that
incorporates a single channel ATA100 IDE controller that is functionally
identical to the Serverworks CSB6 IDE controller.  This patch adds support
for the new PCI device ID and also the support for this controller.

Signed-off-by: Narendra Sankar <nsankar@broadcom.com>
Acked-by: Jeff Garzik <jgarzik@pobox.com>
Signed-off-by: Bartlomiej Zolnierkiewicz <bzolnier@elka.pw.edu.pl>
2005-08-18 22:30:35 +02:00
Matt Gillette
2f09a7f4af [PATCH] ide: add support for Netcell Revolution to pci-ide generic driver
Adds support for Netcell Revolution to pci-ide generic driver by including
it in the list of devices matched.  Includes the Revolution in the list of
simplex devices forced into DMA mode.

Signed-off-by: Matt Gillette <matt.gillette@netcell.com>
Cc: Bartlomiej Zolnierkiewicz <B.Zolnierkiewicz@elka.pw.edu.pl>
Cc: Jeff Garzik <jgarzik@pobox.com>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Bartlomiej Zolnierkiewicz <bzolnier@elka.pw.edu.pl>
2005-08-18 22:27:07 +02:00
Grant Coady
b07e5eccaf [PATCH] ide: fix PCI_DEVIEC_ID_APPLE_UNI_N_ATA spelling
Signed-off-by: Grant Coady <gcoady@gmail.com>
Signed-off-by: Bartlomiej Zolnierkiewicz <bzolnier@elka.pw.edu.pl>
2005-08-18 22:19:55 +02:00
Chuck Lever
dc59250c6e [PATCH] NFS: Introduce the use of inode->i_lock to protect fields in nfsi
Down the road we want to eliminate the use of the global kernel lock entirely
from the NFS client.  To do this, we need to protect the fields in the
nfs_inode structure adequately.  Start by serializing updates to the
"cache_validity" field.

Note this change addresses an SMP hang found by njw@osdl.org, where processes
deadlock because nfs_end_data_update and nfs_revalidate_mapping update the
"cache_validity" field without proper serialization.

Test plan:
 Millions of fsx ops on SMP clients.  Run Nick Wilson's breaknfs program on
 large SMP clients.

Signed-off-by: Chuck Lever <cel@netapp.com>
Cc: Trond Myklebust <trond.myklebust@fys.uio.no>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2005-08-18 12:53:57 -07:00
Chuck Lever
412d582ec1 [PATCH] NFS: use atomic bitops to manipulate flags in nfsi->flags
Introduce atomic bitops to manipulate the bits in the nfs_inode structure's
"flags" field.

Using bitops means we can use a generic wait_on_bit call instead of an ad hoc
locking scheme in fs/nfs/inode.c, so we can remove the "nfs_i_wait" field from
nfs_inode at the same time.

The other new flags field will continue to use bitmask and logic AND and OR.
This permits several flags to be set at the same time efficiently.  The
following patch adds a spin lock to protect these flags, and this spin lock
will later cover other fields in the nfs_inode structure, amortizing the cost
of using this type of serialization.

Test plan:
 Millions of fsx ops on SMP clients.

Signed-off-by: Chuck Lever <cel@netapp.com>
Cc: Trond Myklebust <trond.myklebust@fys.uio.no>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2005-08-18 12:53:56 -07:00
Chuck Lever
5529680981 [PATCH] NFS: split nfsi->flags into two fields
Certain bits in nfsi->flags can be manipulated with atomic bitops, and some
are better manipulated via logical bitmask operations.

This patch splits the flags field into two.  The next patch introduces atomic
bitops for one of the fields.

Test plan:
 Millions of fsx ops on SMP clients.

Signed-off-by: Chuck Lever <cel@netapp.com>
Cc: Trond Myklebust <trond.myklebust@fys.uio.no>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2005-08-18 12:53:56 -07:00
David Woodhouse
327b6b08d6 Merge with master.kernel.org:/pub/scm/linux/kernel/git/torvalds/linux-2.6.git 2005-08-17 14:37:55 +01:00
Kristen Accardi
4602b88d97 [PATCH] PCI: 6700/6702PXH quirk
On the 6700/6702 PXH part, a MSI may get corrupted if an ACPI hotplug
driver and SHPC driver in MSI mode are used together.

This patch will prevent MSI from being enabled for the SHPC as part of
an early pci quirk, as well as on any pci device which sets the no_msi
bit.

Signed-off-by: Kristen Carlson Accardi <kristen.c.accardi@intel.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2005-08-16 21:06:24 -07:00
Trond Myklebust
65e4308d25 [PATCH] NFS: Ensure we always update inode->i_mode when doing O_EXCL creates
When the client performs an exclusive create and opens the file for writing,
a Netapp filer will first create the file using the mode 01777. It does this
since an NFSv3/v4 exclusive create cannot immediately set the mode bits.
The 01777 mode then gets put into the inode->i_mode. After the file creation
is successful, we then do a setattr to change the mode to the correct value
(as per the NFS spec).

The problem is that nfs_refresh_inode() no longer updates inode->i_mode, so
the latter retains the 01777 mode. A bit later, the VFS notices this, and calls
remove_suid(). This of course now resets the file mode to inode->i_mode & 0777.
Hey presto, the file mode on the server is now magically changed to 0777. Duh...

Fixes http://bugzilla.linux-nfs.org/show_bug.cgi?id=32

Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2005-08-16 09:30:58 -07:00
Trond Myklebust
58fcb8df0b [PATCH] NFS: Ensure ACL xdr code doesn't overflow.
Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2005-08-16 08:52:11 -07:00
Len Brown
09d9200271 Merge from-linus to-akpm 2005-08-15 16:07:26 -04:00
John McCutchan
89204c40a0 [PATCH] inotify: add MOVE_SELF event
This adds a MOVE_SELF event to inotify.  It is sent whenever the inode
you are watching is moved.  We need this event so that we can catch
something like this:

 - app1:
	watch /etc/mtab

 - app2:
	cp /etc/mtab /tmp/mtab-work
	mv /etc/mtab /etc/mtab~
	mv /tmp/mtab-work /etc/mtab

app1 still thinks it's watching /etc/mtab but it's actually watching
/etc/mtab~.

Signed-off-by: John McCutchan <ttb@tentacle.dhs.org>
Signed-off-by: Robert Love <rml@novell.com>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2005-08-15 09:50:31 -07:00
Jeff Garzik
4c0e176dd5 Merge /spare/repo/linux-2.6/ 2005-08-14 23:10:00 -04:00
James Bottomley
d0a7e57400 [SCSI] correct transport class abstraction to work outside SCSI
I recently tried to construct a totally generic transport class and
found there were certain features missing from the current abstract
transport class.  Most notable is that you have to hang the data on the
class_device but most of the API is framed in terms of the generic
device, not the class_device.

These changes are two fold

- Provide the class_device to all of the setup and configure APIs
- Provide and extra API to take the device and the attribute class and
  return the corresponding class_device

Signed-off-by: James Bottomley <James.Bottomley@SteelEye.com>
2005-08-14 17:21:27 -05:00
Matt Mackall
53fb95d3c1 [NETPOLL]: fix initialization/NAPI race
This fixes a race during initialization with the NAPI softirq
processing by using an RCU approach.

This race was discovered when refill_skbs() was added to
the setup code.

Signed-off-by: Matt Mackall <mpm@selenic.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2005-08-11 19:27:43 -07:00
Matt Mackall
0db1d6fc1e [NETPOLL]: add retry timeout
Add limited retry logic to netpoll_send_skb

Each time we attempt to send, decrement our per-device retry counter.
On every successful send, we reset the counter. 

We delay 50us between attempts with up to 20000 retries for a total of
1 second. After we've exhausted our retries, subsequent failed
attempts will try only once until reset by success.

Signed-off-by: Matt Mackall <mpm@selenic.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2005-08-11 19:25:54 -07:00
Alexey Dobriyan
a0d3bea3cf [NET]: Make skb->protocol __be16
There are many instances of

	skb->protocol = htons(ETH_P_*);
	skb->protocol = __constant_htons(ETH_P_*);
and
	skb->protocol = *_type_trans(...);

Most of *_type_trans() are already endian-annotated, so, let's shift
attention on other warnings.

Signed-off-by: Alexey Dobriyan <adobriyan@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2005-08-11 16:05:50 -07:00
Douglas Gilbert
972dcafb6d [libata scsi] add START STOP UNIT translation 2005-08-11 03:35:53 -04:00
Jeff Garzik
2bf69b5fe9 phy subsystem: more cleanups
- unexport symbols never used outside of home module
- remove dead code
- remove CONFIG_PHYCONTROL, make it unconditionally enabled
2005-08-11 02:47:54 -04:00
Jeff Garzik
67c4f3fa25 Fix numerous minor problems with new phy subsystem.
Includes fixes for problems noted by Adrian Bunk, Andrew Morton,
and one other person lost in the annals of history (and email folders).
2005-08-11 02:07:25 -04:00
Len Brown
95f193aa4f Merge ../to-linus 2005-08-11 00:56:08 -04:00
Jeff Garzik
cd04b947bc Merge /spare/repo/linux-2.6/ 2005-08-11 00:07:03 -04:00
Jeff Garzik
a7144b23da Merge /spare/repo/linux-2.6/ 2005-08-10 13:43:09 -04:00
Christoph Lameter
86b3786078 [PATCH] Fix ide-disk.c oops caused by hwif == NULL
1. Move hwif_to_node to ide.h

2. Use hwif_to_node in ide-disk.c

Signed-off-by: Christoph Lameter <clameter@sgi.com>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2005-08-09 20:21:31 -07:00
David Woodhouse
c973b112c7 Merge with /shiny/git/linux-2.6/.git 2005-08-09 16:51:35 +01:00
John McCutchan
00dd1e4339 [PATCH] fsnotify-cleanups
This removes the now unused fsnotify_unlink & fsnotify_rmdir code.
Compile tested.

Signed-off-by: John McCutchan <ttb@tentacle.dhs.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2005-08-08 19:22:42 -07:00
Linus Torvalds
dc836b5b6f Revert "[PATCH] PCI: restore BAR values..."
Revert commit fec59a711e, which is
breaking sparc64 that doesn't have a working pci_update_resource.

We'll re-do this after 2.6.13 when we'll do it all properly.
2005-08-08 18:46:09 -07:00
Linus Torvalds
92e52b2e82 Merge master.kernel.org:/pub/scm/linux/kernel/git/davem/net-2.6 2005-08-08 16:06:01 -07:00
David S. Miller
4d479e40e1 [NETLINK]: Allocate and kill some netlink numbers.
NETLINK_ARPD is unused, allocate it to the Open-iSCSI folks.

NETLINK_ROUTE6 and NETLINK_TAPBASE are no longer used, delete
them.

Signed-off-by: David S. Miller <davem@davemloft.net>
2005-08-08 13:48:02 -07:00
John McCutchan
7a91bf7f5c [PATCH] fsnotify_name/inoderemove
The patch below unhooks fsnotify from vfs_unlink & vfs_rmdir.  It
introduces two new fsnotify calls, that are hooked in at the dcache
level.  This not only more closely matches how the VFS layer works, it
also avoids the problem with locking and inode lifetimes.

The two functions are

 - fsnotify_nameremove -- called when a directory entry is going away.
   It notifies the PARENT of the deletion.  This is called from
   d_delete().

 - inoderemove -- called when the files inode itself is going away.  It
   notifies the inode that is being deleted.  This is called from
   dentry_iput().

Signed-off-by: John McCutchan <ttb@tentacle.dhs.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2005-08-08 11:53:47 -07:00
Olaf Hering
9ae5b3c703 [PATCH] remove linux/pagemap.h from linux/swap.h
sparc can not include linux/pagemap.h because of the following circular
dependency:

asm-sparc/pgtable include linux/swap.h
linux/swap.h include now linux/pagemap.h
linux/pagemap.h include linux/mm.h
linux/mm.h include asm/pgtable.h

It needs to have the swp_entry_t type fully visible in pgtable.h,
we can't work around this using macros.

Signed-off-by: Olaf Hering <olh@suse.de>
Cc: William Lee Irwin III <wli@holomorphy.com>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2005-08-07 10:00:38 -07:00
Linus Torvalds
243393c90f Add fakey 'deflateBound()' function to the in-kernel zlib routines
It's not the real deflateBound() in newer zlib libraries, partly because
the upcoming usage of it won't have the "stream" available, so we can't
have the same interfaces anyway.
2005-08-06 09:39:57 -07:00
Tejun Heo
ba02508248 [PATCH] blk: fix tag shrinking (revive real_max_size)
My patch in commit fa72b903f7 incorrectly
removed blk_queue_tag->real_max_depth.

The original resize implementation was incorrect in the following
points.

 * actual allocation size of tag_index was shorter than real_max_size,
   but assumed to be of the same size, possibly causing memory access
   beyond the allocated area.
 * bits in tag_map between max_deptn and real_max_depth were
   initialized to 1's, making the tags permanently reserved.

In an attempt to fix above two bugs, I had removed allocation optimization
in init_tag_map and real_max_size.  Tag map/index were allocated and freed
immediately during resize.

Unfortunately, I wasn't considering that tag map/index can be resized
dynamically with tags beyond new_depth active.  This led to accessing
freed area after shrinking tags and led to the following bug reporting
thread on linux-scsi.

   http://marc.theaimsgroup.com/?l=linux-scsi&m=112319898111885&w=2

To fix the problem, I've revived real_max_depth without allocation
optimization in init_tag_map, and Andrew Vasquez confirmed that the
problem was fixed.  As Jens is not going to be available for a week, he
asked me to make sure that this patch reaches you.

   http://marc.theaimsgroup.com/?l=linux-scsi&m=112325778530886&w=2

Also, a comment was added to make sure that real_max_size is needed for
dynamic shrinking.

Signed-off-by: Tejun Heo <htejun@gmail.com>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2005-08-05 13:43:16 -07:00
Len Brown
e872d4cace Merge ../from-linus 2005-08-05 13:03:06 -04:00
John McCutchan
0c3dba1534 [PATCH] Clean up inotify delete race fix
This avoids the whole #ifdef mess by just getting a copy of
dentry->d_inode before d_delete is called - that makes the codepaths the
same for the INOTIFY/DNOTIFY cases as for the regular no-notify case.
I've been running this under a Gnome session for the last 10 minutes.
Inotify is being used extensively.

Signed-off-by: John McCutchan <ttb@tentacle.dhs.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2005-08-04 21:37:39 -07:00
John W. Linville
fec59a711e [PATCH] PCI: restore BAR values after D3hot->D0 for devices that need it
Some PCI devices (e.g. 3c905B, 3c556B) lose all configuration
(including BARs) when transitioning from D3hot->D0.  This leaves such
a device in an inaccessible state.  The patch below causes the BARs
to be restored when enabling such a device, so that its driver will
be able to access it.

The patch also adds pci_restore_bars as a new global symbol, and adds a
correpsonding EXPORT_SYMBOL_GPL for that.

Some firmware (e.g. Thinkpad T21) leaves devices in D3hot after a
(re)boot.  Most drivers call pci_enable_device very early, so devices
left in D3hot that lose configuration during the D3hot->D0 transition
will be inaccessible to their drivers.

Drivers could be modified to account for this, but it would
be difficult to know which drivers need modification.  This is
especially true since often many devices are covered by the same
driver.  It likely would be necessary to replicate code across dozens
of drivers.

The patch below should trigger only when transitioning from D3hot->D0
(or at boot), and only for devices that have the "no soft reset" bit
cleared in the PM control register.  I believe it is safe to include
this patch as part of the PCI infrastructure.

The cleanest implementation of pci_restore_bars was to call
pci_update_resource.  Unfortunately, that does not currently exist
for the sparc64 architecture.  The patch below includes a null
implemenation of pci_update_resource for sparc64.

Some have expressed interest in making general use of the the
pci_restore_bars function, so that has been exported to GPL licensed
modules.

Signed-off-by: John W. Linville <linville@tuxdriver.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2005-08-04 21:32:46 -07:00
Len Brown
1d492eb413 [ACPI] Merge acpi-2.6.12 branch into 2.6.13-rc3
Signed-off-by: Len Brown <len.brown@intel.com>
2005-08-05 00:31:42 -04:00
Andrew Morton
53de49f52e [ACPI] CONFIG_ACPI=n build fix
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Len Brown <len.brown@intel.com>
2005-08-04 22:27:11 -04:00
Kenji Kaneshige
1f3a6a1577 [ACPI] acpi_register_gsi() can return error
Current acpi_register_gsi() function has no way to indicate errors to its
callers even though acpi_register_gsi() can fail to register gsi because of
some reasons (out of memory, lack of interrupt vectors, incorrect BIOS, and so
on).  As a result, caller of acpi_register_gsi() cannot handle the case that
acpi_register_gsi() fails.  I think failure of acpi_register_gsi() should be
handled properly.

This series of patches changes acpi_register_gsi() to return negative value on
error, and also changes callers of acpi_register_gsi() to handle failure of
acpi_register_gsi().

This patch changes the type of return value of acpi_register_gsi() from
"unsigned int" to "int" to indicate an error.  If acpi_register_gsi() fails to
register gsi, it returns negative value.

Signed-off-by: Kenji Kaneshige <kaneshige.kenji@jp.fujitsu.com>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Len Brown <len.brown@intel.com>
2005-08-04 22:12:08 -04:00
NeilBrown
6b8b3e8a8b [PATCH] md: make sure md bitmap updates are flushed when array is stopped.
The recent change to never ignore the bitmap, revealed that the bitmap isn't
begin flushed properly when an array is stopped.

We call bitmap_daemon_work three times as there is a three-stage pipeline for
flushing updates to the bitmap file.

Signed-off-by: Neil Brown <neilb@cse.unsw.edu.au>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2005-08-04 13:00:54 -07:00
Nick Piggin
f33ea7f404 [PATCH] fix get_user_pages bug
Checking pte_dirty instead of pte_write in __follow_page is problematic
for s390, and for copy_one_pte which leaves dirty when clearing write.

So revert __follow_page to check pte_write as before, and make
do_wp_page pass back a special extra VM_FAULT_WRITE bit to say it has
done its full job: once get_user_pages receives this value, it no longer
requires pte_write in __follow_page.

But most callers of handle_mm_fault, in the various architectures, have
switch statements which do not expect this new case.  To avoid changing
them all in a hurry, make an inline wrapper function (using the old
name) that masks off the new bit, and use the extended interface with
double underscores.

Yes, we do have a call to do_wp_page from do_swap_page, but no need to
change that: in rare case it's needed, another do_wp_page will follow.

Signed-off-by: Hugh Dickins <hugh@veritas.com>
[ Cleanups by Nick Piggin ]
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2005-08-03 09:12:05 -07:00
Adrian Bunk
0072b1389c [PATCH] include/linux/dcookies.h: dummy functions must be "static inline"
We don't want these to be global functions.

Signed-off-by: Adrian Bunk <bunk@stusta.de>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2005-08-01 21:37:59 -07:00
John McCutchan
7544953685 [PATCH] inotify: fix file deletion by rename detection
When a file is moved over an existing file that you are watching,
inotify won't send you a DELETE_SELF event and it won't unref the inode
until the inotify instance is closed by the application.

Signed-off-by: John McCutchan <ttb@tentacle.dhs.org>
Signed-off-by: Robert Love <rml@novell.com>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2005-08-01 09:16:53 -07:00
Jeff Garzik
8a60a07129 libata: trim trailing whitespace.
Also, fixup a tabs-to-spaces block of code in ata_piix.
2005-07-31 13:13:24 -04:00
Daniel Drake
541134cfe7 [PATCH] sata_nv: Support MCP51/MCP55 device IDs
This is a multi-part message in MIME format.
Signed-off-by: Jeff Garzik <jgarzik@pobox.com>
2005-07-31 01:04:43 -04:00
Andy Fleming
00db8189d9 This patch adds a PHY Abstraction Layer to the Linux Kernel, enabling
ethernet drivers to remain as ignorant as is reasonable of the connected
PHY's design and operation details.

Signed-off-by: Andy Fleming <afleming@freescale.com>
Signed-off-by: Jeff Garzik <jgarzik@pobox.com>
2005-07-30 19:31:23 -04:00
Jeff Garzik
a670fcb43f /spare/repo/netdev-2.6 branch 'master' 2005-07-30 18:14:15 -04:00
Len Brown
adbedd3424 merge 2.6.13-rc4 with ACPI's to-linus tree 2005-07-30 01:55:32 -04:00
Len Brown
d6ac1a7910 /home/lenb/src/to-linus branch 'acpi-2.6.12' 2005-07-29 23:31:17 -04:00
David Shaohua Li
87bec66b96 [ACPI] suspend/resume ACPI PCI Interrupt Links
Add reference count and disable ACPI PCI Interrupt Link
when no device still uses it.

Warn when drivers have not released Link at suspend time.

http://bugzilla.kernel.org/show_bug.cgi?id=3469

Signed-off-by: David Shaohua Li <shaohua.li@intel.com>
Signed-off-by: Len Brown <len.brown@intel.com>
2005-07-29 22:49:38 -04:00
Kumar Gala
a46e812620 [PATCH] PCI: fix up errors after dma bursting patch and CONFIG_PCI=n -- bug?
In the patch from:

http://www.uwsg.iu.edu/hypermail/linux/kernel/0506.3/0985.html

Is the the following line suppose inside the if CONFIG_PCI=n

  #define pci_dma_burst_advice(pdev, strat, strategy_parameter) do { } while (0)

Signed-off-by: Kumar Gala <kumar.gala@freescale.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2005-07-29 13:12:52 -07:00
Linus Torvalds
e0d7ff168a Merge master.kernel.org:/pub/scm/linux/kernel/git/dtor/input 2005-07-29 09:48:34 -07:00
Linus Torvalds
2ac6608c41 Revert broken "statement with no effect" warning fix
It may shut up gcc, but it also incorrectly changes the semantics of the
smp_call_function() helpers.

You can fix the warning other ways if you are interested (create another
inline function that takes no arguments and returns zero), but
preferably gcc just shouldn't complain about unused return values from
statement expressions in the first place.
2005-07-28 10:34:47 -07:00
Richard Henderson
79a8810221 [PATCH] alpha: fix "statement with no effect" warnings
Apparently gcc 4.0 complains about "({ 0; });", which leads to -Werror
breakage in one of the alpha oprofile modules.

One might could argue that this is a gcc bug, in that statement-expressions
should be considered to be function-like rather than statement-like for the
purposes of this warning.  But it's just as easy to use an inline function
in the first place, side-stepping the issue.

Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2005-07-28 08:39:02 -07:00
Russell King
661299d9d0 Merge with Linus' 2.6 tree 2005-07-28 09:30:20 +01:00
Ralf Baechle
e5c2d74917 [PATCH] serial_core whitespace fix
Use tabs for formatting like anywhere else in this file.

Cc: Russell King <rmk@arm.linux.org.uk>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2005-07-27 16:26:09 -07:00
Olaf Hering
44456d37b5 [PATCH] turn many #if $undefined_string into #ifdef $undefined_string
turn many #if $undefined_string into #ifdef $undefined_string to fix some
warnings after -Wno-def was added to global CFLAGS

Signed-off-by: Olaf Hering <olh@suse.de>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2005-07-27 16:26:08 -07:00
Andreas Gruenbacher
8c52ab42c1 [PATCH] mbcache: Remove unused mb_cache_shrink parameter
The cache parameter to mb_cache_shrink isn't used.  We may as well remove
it.

Signed-off-by: Andreas Gruenbacher <agruen@suse.de>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2005-07-27 16:26:07 -07:00
Peter Staubach
c293621bbf [PATCH] stale POSIX lock handling
I believe that there is a problem with the handling of POSIX locks, which
the attached patch should address.

The problem appears to be a race between fcntl(2) and close(2).  A
multithreaded application could close a file descriptor at the same time as
it is trying to acquire a lock using the same file descriptor.  I would
suggest that that multithreaded application is not providing the proper
synchronization for itself, but the OS should still behave correctly.

SUS3 (Single UNIX Specification Version 3, read: POSIX) indicates that when
a file descriptor is closed, that all POSIX locks on the file, owned by the
process which closed the file descriptor, should be released.

The trick here is when those locks are released.  The current code releases
all locks which exist when close is processing, but any locks in progress
are handled when the last reference to the open file is released.

There are three cases to consider.

One is the simple case, a multithreaded (mt) process has a file open and
races to close it and acquire a lock on it.  In this case, the close will
release one reference to the open file and when the fcntl is done, it will
release the other reference.  For this situation, no locks should exist on
the file when both the close and fcntl operations are done.  The current
system will handle this case because the last reference to the open file is
being released.

The second case is when the mt process has dup(2)'d the file descriptor.
The close will release one reference to the file and the fcntl, when done,
will release another, but there will still be at least one more reference
to the open file.  One could argue that the existence of a lock on the file
after the close has completed is okay, because it was acquired after the
close operation and there is still a way for the application to release the
lock on the file, using an existing file descriptor.

The third case is when the mt process has forked, after opening the file
and either before or after becoming an mt process.  In this case, each
process would hold a reference to the open file.  For each process, this
degenerates to first case above.  However, the lock continues to exist
until both processes have released their references to the open file.  This
lock could block other lock requests.

The changes to release the lock when the last reference to the open file
aren't quite right because they would allow the lock to exist as long as
there was a reference to the open file.  This is too long.

The new proposed solution is to add support in the fcntl code path to
detect a race with close and then to release the lock which was just
acquired when such as race is detected.  This causes locks to be released
in a timely fashion and for the system to conform to the POSIX semantic
specification.

This was tested by instrumenting a kernel to detect the handling locks and
then running a program which generates case #3 above.  A dangling lock
could be reliably generated.  When the changes to detect the close/fcntl
race were added, a dangling lock could no longer be generated.

Cc: Matthew Wilcox <willy@debian.org>
Cc: Trond Myklebust <trond.myklebust@fys.uio.no>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2005-07-27 16:26:06 -07:00
Martin Schwidefsky
951f22d5b1 [PATCH] s390: spin lock retry
Split spin lock and r/w lock implementation into a single try which is done
inline and an out of line function that repeatedly tries to get the lock
before doing the cpu_relax().  Add a system control to set the number of
retries before a cpu is yielded.

The reason for the spin lock retry is that the diagnose 0x44 that is used to
give up the virtual cpu is quite expensive.  For spin locks that are held only
for a short period of time the costs of the diagnoses outweights the savings
for spin locks that are held for a longer timer.  The default retry count is
1000.

Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2005-07-27 16:26:04 -07:00
Andrey Panin
4bfdf37830 [PATCH] consolidate CONFIG_WATCHDOG_NOWAYOUT handling
Attached patch removes #ifdef CONFIG_WATCHDOG_NOWAYOUT mess duplicated in
almost every watchdog driver and replaces it with common define in
linux/watchdog.h.

Signed-off-by: Andrey Panin <pazke@donpac.ru>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2005-07-27 16:25:54 -07:00
Olivier Blin
49f2991585 [PATCH] i4l: add Olitec ISDN PCI card in hisax gazel driver
This patch adds support for the Olitec ISDN PCI card in the hisax gazel
driver.  The gazel driver supports this card, but wasn't aware of its PCI
ids.  Users used to modify the PCI ids of a supported card in
include/linux/pci_ids.h and recompile their kernel to get this card
running, as said in most Howtos.  This patch makes the hisax gazel driver
recognize the PCI ids of the Olitec ISDN PCI card.

Signed-off-by: Olivier Blin <oblin@mandriva.com>
Signed-off-by: Karsten Keil <kkeil@suse.de>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2005-07-27 16:25:51 -07:00
Alexey Dobriyan
c10b873695 [PATCH] Really __nocast-annotate kmalloc_node()
One chunk was lost somewhere between my and Andrew's machine.

Noticed by Victor Fusco.

Signed-off-by: Alexey Dobriyan <adobriyan@gmail.com>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2005-07-27 16:25:47 -07:00
David Woodhouse
c5fbc3966f Merge with master.kernel.org:/pub/scm/linux/kernel/git/torvalds/linux-2.6.git 2005-07-27 14:14:13 +01:00
Russell King
05caac585f [SERIAL] Convert parport_serial to use new 8250_pci interfaces
Convert parport_serial to use the new 8250_pci interface, converting
the table to a pciserial_board table.  This also unuses the SPCI_*
definitions in serialP.h, which can now be removed.

Signed-off-by: Russell King <rmk+kernel@arm.linux.org.uk>
2005-07-27 11:41:18 +01:00
Russell King
241fc4367b [SERIAL] Expose 8250_pci setup/removal/suspend/resume functions
Re-jig the setup/removal/suspend/resume of 8250 pci ports so that they
know slightly less about how they're attached to a PCI device.  Expose
this as the new interface for registering PCI serial ports, as well as
the pciserial_board structure and associated flag definitions.

Signed-off-by: Russell King <rmk+kernel@arm.linux.org.uk>
2005-07-27 11:35:54 +01:00
Linus Torvalds
4d7de66e2c Merge master.kernel.org:/pub/scm/linux/kernel/git/davem/net-2.6 2005-07-26 16:43:39 -07:00
Adrian Bunk
cadf01c2fc [NETFILTER]: Fix ip_conntrack_put() prototype.
The function is not inline.

Signed-off-by: Adrian Bunk <bunk@stusta.de>
Signed-off-by: David S. Miller <davem@davemloft.net>
2005-07-26 15:39:28 -07:00
Eric W. Biederman
7c9034735e [PATCH] Add emergency_restart()
When the kernel is working well and we want to restart cleanly
kernel_restart is the function to use.   But in many instances
the kernel wants to reboot when thing are expected to be working
very badly such as from panic or a software watchdog handler.

This patch adds the function emergency_restart() so that
callers can be clear what semantics they expect when calling
restart.  emergency_restart() is expected to be callable
from interrupt context and possibly reliable in even more
trying circumstances.

This is an initial generic implementation for all architectures.

Signed-off-by: Eric W. Biederman <ebiederm@xmission.com>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2005-07-26 14:35:41 -07:00
Eric W. Biederman
4a00ea1e18 [PATCH] Refactor sys_reboot into reusable parts
Because the factors of sys_reboot don't exist people calling
into the reboot path duplicate the code badly, leading to
inconsistent expectations of code in the reboot path.

This patch should is just code motion.

Signed-off-by: Eric W. Biederman <ebiederm@xmission.com>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2005-07-26 14:35:41 -07:00