Commit Graph

19512 Commits

Author SHA1 Message Date
Zhang Rui
207339398e ACPI: attach thermal zone info
Intel menlow driver needs to get the pointer of themal_zone_device
structure of an ACPI thermal zone.
Attach this to each ACPI thermal zone device object.

Signed-off-by: Zhang Rui <rui.zhang@intel.com>
Signed-off-by: Thomas Sujith <sujith.thomas@intel.com>
Signed-off-by: Len Brown <len.brown@intel.com>
2008-02-01 23:20:20 -05:00
Zhang Rui
d9460fd227 ACPI: register ACPI Processor as generic thermal cooling device
Register ACPI processor as thermal cooling devices.
A combination of processor T-state and P-state are used for thermal throttling.
the processor will reduce the frequency first and then set the T-state.

we use cpufreq_thermal_reduction_pctg to calculate the cpufreq limit,
and call cpufreq_verify_with_limit to set the cpufreq limit.
if cpufreq driver is loaded, then we have four cooling state for cpufreq control.
cooling state 0: cpufreq limit == max_freq
cooling state 1: cpufreq limit == max_freq * 80%
cooling state 2: cpufreq limit == max_freq * 60%
cooling state 3: cpufreq limit == max_freq * 40%

after the cpufreq limit is set to 40 percentage of the max_freq,
we use T-state for cooling.

eg. a processor has P-state support, and it has 8 T-state (T0-T7),
the max_state of the proceesor is 10:

state	cpufreq-limit  T-state
0:	max_freq	T0
1:	max_freq * 80%	T0
2:	max_freq * 60%	T0
3:	max_freq * 40%	T0
4:	max_freq * 40%	T1
5:	max_freq * 40%	T2
6:	max_freq * 40%	T3
7:	max_freq * 40%	T4
8:	max_freq * 40%	T5
9:	max_freq * 40%	T6
10:	max_freq * 40%	T7

Signed-off-by: Zhang Rui <rui.zhang@intel.com>
Signed-off-by: Zhao Yakui <yakui.zhao@intel.com>
Signed-off-by: Thomas Sujith <sujith.thomas@intel.com>
Signed-off-by: Len Brown <len.brown@intel.com>
2008-02-01 23:18:19 -05:00
Zhang Rui
203d3d4aa4 the generic thermal sysfs driver
The Generic Thermal sysfs driver for thermal management.

Signed-off-by: Zhang Rui <rui.zhang@intel.com>
Signed-off-by: Thomas Sujith <sujith.thomas@intel.com>
Signed-off-by: Len Brown <len.brown@intel.com>
2008-02-01 23:12:19 -05:00
Linus Torvalds
f3191248bf Merge git://git.kernel.org/pub/scm/linux/kernel/git/bart/ide-2.6
* git://git.kernel.org/pub/scm/linux/kernel/git/bart/ide-2.6: (100 commits)
  ide: move hwif_register() call out of ide_probe_port()
  ide: factor out code for tuning devices from ide_probe_port()
  ide: move handling of I/O resources out of ide_probe_port()
  ide: make probe_hwif() return an error value
  ide: use ide_remove_port_from_hwgroup in init_irq()
  ide: prepare init_irq() for using ide_remove_port_from_hwgroup()
  ide: factor out code removing port from hwgroup from ide_unregister()
  ide: I/O resources are released too early in ide_unregister()
  ide: cleanup ide_system_bus_speed()
  ide: remove needless zeroing of hwgroup fields from init_irq()
  ide: remove unused ide_hwgroup_t fields
  ide_platform: remove struct hwif_prop
  ide: remove hwif->present manipulations from hwif_init()
  ide: move wait_hwif_ready() documentation in the right place
  ide: fix handling of busy I/O resources in probe_hwif()
  <linux/hdsmart.h> is not used by kernel code
  ide: don't include <linux/hdsmart.h>
  ide-floppy: cleanup header
  ide: update/add my Copyrights
  ide: delete filenames/versions from comments
  ...
2008-02-02 09:58:02 +11:00
Bartlomiej Zolnierkiewicz
fbd130887a ide: use ide_remove_port_from_hwgroup in init_irq()
There should be no functionality changes caused by this patch.

Signed-off-by: Bartlomiej Zolnierkiewicz <bzolnier@gmail.com>
2008-02-01 23:09:36 +01:00
Bartlomiej Zolnierkiewicz
a6fbb1c8c3 ide: remove unused ide_hwgroup_t fields
Signed-off-by: Bartlomiej Zolnierkiewicz <bzolnier@gmail.com>
2008-02-01 23:09:35 +01:00
Bartlomiej Zolnierkiewicz
76166952bb <linux/hdsmart.h> is not used by kernel code
Signed-off-by: Bartlomiej Zolnierkiewicz <bzolnier@gmail.com>
2008-02-01 23:09:34 +01:00
Bartlomiej Zolnierkiewicz
dac2242047 ide: don't include <linux/hdsmart.h>
IDE doesn't need it.

Signed-off-by: Bartlomiej Zolnierkiewicz <bzolnier@gmail.com>
2008-02-01 23:09:34 +01:00
Bartlomiej Zolnierkiewicz
062f9f024d ide: use ide_build_sglist() and ide_destroy_dmatable() in non-PCI host drivers
* Make ide_build_sglist() and ide_destroy_dmatable() available also when
  CONFIG_BLK_DEV_IDEDMA_PCI=n.

* Use ide_build_sglist() and ide_destroy_dmatable() in {ics,au1xxx-}ide.c
  and remove no longer needed {ics,au}ide_build_sglist().

There should be no functionality changes caused by this patch.

Cc: Russell King <rmk@arm.linux.org.uk>
Signed-off-by: Bartlomiej Zolnierkiewicz <bzolnier@gmail.com>
2008-02-01 23:09:32 +01:00
Bartlomiej Zolnierkiewicz
5df37c34a3 au1xxx-ide: use hwif->dev
* Setup hwif->dev in au_ide_probe().

* Use hwif->dev instead of ahwif->dev in auide_build_sglist(),
  auide_build_dmatable(), auide_dma_end() and auide_ddma_init().

* Remove no longer needed 'dev' field from _auide_hwif type.

Signed-off-by: Bartlomiej Zolnierkiewicz <bzolnier@gmail.com>
2008-02-01 23:09:31 +01:00
Bartlomiej Zolnierkiewicz
36501650ec ide: keep pointer to struct device instead of struct pci_dev in ide_hwif_t
Keep pointer to struct device instead of struct pci_dev in ide_hwif_t.

While on it:
* Use *dev->dma_mask instead of pci_dev->dma_mask in ide_toggle_bounce().

There should be no functionality changes caused by this patch.

Signed-off-by: Bartlomiej Zolnierkiewicz <bzolnier@gmail.com>
2008-02-01 23:09:31 +01:00
Bartlomiej Zolnierkiewicz
4166c1993b ide: add IDE_HFLAG_NO_DSC host flag
* Add IDE_HFLAG_NO_DSC host flag for hosts that doesn't support DSC overlap.

* Set it in aec62xx (for ATP850UF only) and hpt34x host drivers.

* Convert ide-tape device driver to check for IDE_HFLAG_NO_DSC flag.

Acked-by: Sergei Shtylyov <sshtylyov@ru.mvista.com>
Signed-off-by: Bartlomiej Zolnierkiewicz <bzolnier@gmail.com>
2008-02-01 23:09:30 +01:00
Bartlomiej Zolnierkiewicz
8ac2b42a45 ide: add IDE_HFLAG_CLEAR_SIMPLEX host flag
* Rename 'simplex_stat' variable to 'dma_stat' in ide_get_or_set_dma_base().

* Factor out code for forcing host out of "simplex" mode from
  ide_get_or_set_dma_base() to ide_pci_clear_simplex() helper.

* Add IDE_HFLAG_CLEAR_SIMPLEX host flag and set it in alim15x3 (for M5229),
  amd74xx (for AMD 7409), cmd64x (for CMD643), generic (for Netcell) and
  serverworks (for CSB5) host drivers.

* Make ide_get_or_set_dma_base() test for IDE_HFLAG_CLEAR_SIMPLEX host flag
  instead of checking dev->device (BTW the code was buggy because it didn't
  check for dev->vendor, luckily none of these PCI Device IDs was used by
  some other vendor for PCI IDE controller).

Signed-off-by: Bartlomiej Zolnierkiewicz <bzolnier@gmail.com>
2008-02-01 23:09:30 +01:00
Sergei Shtylyov
ecf3279639 ide: ide_setup_dma() assumes 8 ports
According to http://marc.info/?l=linux-ide&m=114346138611631, the drivers must
always register 8 DMA ports with ide_setup_dma(), so its last argument is not
needed. While at it, kill some useless parens in that function...

Signed-off-by: Sergei Shtylyov <sshtylyov@ru.mvista.com>
Signed-off-by: Bartlomiej Zolnierkiewicz <bzolnier@gmail.com>
2008-02-01 23:09:30 +01:00
Bartlomiej Zolnierkiewicz
7b9f25b539 ide: add ide_dump_identify() debug helper
* Add ide_dump_identify() debug helper for dumping raw identify data in
  the hdparm friendly format (== the identify data can be extracted from
  dmesg output and passed to hdparm --Istdin).

* Dump identify data in ide-probe.c::do_identify() if DEBUG is enabled.

Acked-by: Sergei Shtylyov <sshtylyov@ru.mvista.com>
Signed-off-by: Bartlomiej Zolnierkiewicz <bzolnier@gmail.com>
2008-02-01 23:09:28 +01:00
Bartlomiej Zolnierkiewicz
a1bb9457f0 ide-cd: move lba_to_msf() and msf_to_lba() to <linux/cdrom.h>
* Move lba_to_msf() and msf_to_lba() to <linux/cdrom.h>
  (use 'u8' type instead of 'byte' while at it).

* Remove msf_to_lba() copy from drivers/cdrom/cdrom.c.

Acked-by: Jens Axboe <jens.axboe@oracle.com>
Signed-off-by: Bartlomiej Zolnierkiewicz <bzolnier@gmail.com>
2008-02-01 23:09:24 +01:00
Adrian Bunk
da6f4c7f6f ide: make wait_drive_not_busy() static again
After commit 7267c33774 
wait_drive_not_busy() can become static again.

Signed-off-by: Adrian Bunk <bunk@kernel.org>
Signed-off-by: Bartlomiej Zolnierkiewicz <bzolnier@gmail.com>
2008-02-01 23:09:16 +01:00
Adrian Bunk
2eae6ebbf9 ide: small ide-scan-pci.c cleanup
- ide_scan_pcibus() can become static
- instead of ide_scan_pci() we can use ide_scan_pcibus() directly
  in module_init()

Signed-off-by: Adrian Bunk <bunk@kernel.org>
Signed-off-by: Bartlomiej Zolnierkiewicz <bzolnier@gmail.com>
2008-02-01 23:09:16 +01:00
Linus Torvalds
dd5f5fed6c Merge branch 'audit.b46' of git://git.kernel.org/pub/scm/linux/kernel/git/viro/audit-current
* 'audit.b46' of git://git.kernel.org/pub/scm/linux/kernel/git/viro/audit-current:
  [AUDIT] Add uid, gid fields to ANOM_PROMISCUOUS message
  [AUDIT] ratelimit printk messages audit
  [patch 2/2] audit: complement va_copy with va_end()
  [patch 1/2] kernel/audit.c: warning fix
  [AUDIT] create context if auditing was ever enabled
  [AUDIT] clean up audit_receive_msg()
  [AUDIT] make audit=0 really stop audit messages
  [AUDIT] break large execve argument logging into smaller messages
  [AUDIT] include audit type in audit message when using printk
  [AUDIT] do not panic on exclude messages in audit_log_pid_context()
  [AUDIT] Add End of Event record
  [AUDIT] add session id to audit messages
  [AUDIT] collect uid, loginuid, and comm in OBJ_PID records
  [AUDIT] return EINTR not ERESTART*
  [PATCH] get rid of loginuid races
  [PATCH] switch audit_get_loginuid() to task_struct *
2008-02-02 08:37:03 +11:00
Linus Torvalds
3e01dfce13 Merge git://git.kernel.org/pub/scm/linux/kernel/git/x86/linux-2.6-x86
* git://git.kernel.org/pub/scm/linux/kernel/git/x86/linux-2.6-x86:
  x86: avoid section mismatch involving arch_register_cpu
  x86: fixes for lookup_address args
  x86: fix sparse warnings in cpu/common.c
  x86: make early_console static in early_printk.c
  x86: remove unneeded round_up
  x86: fix section mismatch warning in kernel/pci-calgary
  x86: fix section mismatch warning in acpi/boot.c
  x86: fix section mismatch warnings when referencing notifiers
  x86: silence section mismatch warning in smpboot_64.c
  x86: fix comments in vmlinux_64.lds
  x86_64: make bootmap_start page align v6
  x86_64: add debug name for early_res
2008-02-02 08:27:50 +11:00
Eric Paris
de6bbd1d30 [AUDIT] break large execve argument logging into smaller messages
execve arguments can be quite large.  There is no limit on the number of
arguments and a 4G limit on the size of an argument.

this patch prints those aruguments in bite sized pieces.  a userspace size
limitation of 8k was discovered so this keeps messages around 7.5k

single arguments larger than 7.5k in length are split into multiple records
and can be identified as aX[Y]=

Signed-off-by: Eric Paris <eparis@redhat.com>
2008-02-01 14:23:55 -05:00
Eric Paris
c0641f28dc [AUDIT] Add End of Event record
This patch adds an end of event record type. It will be sent by the kernel as
the last record when a multi-record event is triggered. This will aid realtime
analysis programs since they will now reliably know they have the last record
to complete an event. The audit daemon filters this and will not write it to
disk.

Signed-off-by: Steve Grubb <sgrubb redhat com>
Signed-off-by: Eric Paris <eparis@redhat.com>
2008-02-01 14:07:19 -05:00
Eric Paris
4746ec5b01 [AUDIT] add session id to audit messages
In order to correlate audit records to an individual login add a session
id.  This is incremented every time a user logs in and is included in
almost all messages which currently output the auid.  The field is
labeled ses=  or oses=

Signed-off-by: Eric Paris <eparis@redhat.com>
2008-02-01 14:06:51 -05:00
Al Viro
bfef93a5d1 [PATCH] get rid of loginuid races
Keeping loginuid in audit_context is racy and results in messier
code.  Taken to task_struct, out of the way of ->audit_context
changes.

Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
2008-02-01 14:05:28 -05:00
Al Viro
0c11b9428f [PATCH] switch audit_get_loginuid() to task_struct *
all callers pass something->audit_context

Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
2008-02-01 14:04:59 -05:00
Alexander van Heukelum
d987402695 x86: avoid section mismatch involving arch_register_cpu
Avoid section mismatch involving arch_register_cpu.

Marking arch_register_cpu as __init and removing the export
for non-hotplug-cpu configurations makes the following warning
go away:

Section mismatch in reference from the function
arch_register_cpu() to the function .devinit.text:register_cpu()
The function  arch_register_cpu() references
the function __devinit register_cpu().
This is often because arch_register_cpu lacks a __devinit
annotation or the annotation of register_cpu is wrong.

The only external user of arch_register_cpu in the tree is
in drivers/acpi/processor_core.c where it is guarded by
ACPI_HOTPLUG_CPU (which depends on HOTPLUG_CPU).

Signed-off-by: Alexander van Heukelum <heukelum@fastmail.fm>
CC: Sam Ravnborg <sam@ravnborg.org>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
2008-02-01 17:49:43 +01:00
Yinghai Lu
24a5da73f4 x86_64: make bootmap_start page align v6
boot oopses when a system has 64 or 128 GB of RAM installed:

Calling initcall 0xffffffff80bc33b6: sctp_init+0x0/0x711()
BUG: unable to handle kernel NULL pointer dereference at 000000000000005f
IP: [<ffffffff802bfe55>] proc_register+0xe7/0x10f
PGD 0
Oops: 0000 [1] SMP
CPU 0
Modules linked in:
Pid: 1, comm: swapper Not tainted 2.6.24-smp-g5a514e21-dirty #6
RIP: 0010:[<ffffffff802bfe55>]  [<ffffffff802bfe55>] proc_register+0xe7/0x10f
RSP: 0000:ffff810824c57e60  EFLAGS: 00010246
RAX: 000000000000d7d7 RBX: ffff811024c5fa80 RCX: ffff810824c57e08
RDX: 0000000000000000 RSI: 0000000000000195 RDI: ffffffff80cc2460
RBP: ffffffffffffffff R08: 0000000000000000 R09: ffff811024c5fa80
R10: 0000000000000000 R11: 0000000000000002 R12: ffff810824c57e6c
R13: 0000000000000000 R14: ffff810824c57ee0 R15: 00000006abd25bee
FS:  0000000000000000(0000) GS:ffffffff80b4d000(0000) knlGS:0000000000000000
CS:  0010 DS: 0018 ES: 0018 CR0: 000000008005003b
CR2: 000000000000005f CR3: 0000000000201000 CR4: 00000000000006e0
DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
DR3: 0000000000000000 DR6: 00000000ffff0ff0 DR7: 0000000000000400
Process swapper (pid: 1, threadinfo ffff810824c56000, task ffff812024c52000)
Stack:  ffffffff80a57348 0000019500000000 ffff811024c5fa80 0000000000000000
 00000000ffffff97 ffffffff802bfef0 0000000000000000 ffffffffffffffff
 0000000000000000 ffffffff80bc3b4b ffff810824c57ee0 ffffffff80bc34a5
Call Trace:
 [<ffffffff802bfef0>] ? create_proc_entry+0x73/0x8a
 [<ffffffff80bc3b4b>] ? sctp_snmp_proc_init+0x1c/0x34
 [<ffffffff80bc34a5>] ? sctp_init+0xef/0x711
 [<ffffffff80b976e3>] ? kernel_init+0x175/0x2e1
 [<ffffffff8020ccf8>] ? child_rip+0xa/0x12
 [<ffffffff80b9756e>] ? kernel_init+0x0/0x2e1
 [<ffffffff8020ccee>] ? child_rip+0x0/0x12

Code: 1e 48 83 7b 38 00 75 08 48 c7 43 38 f0 e8 82 80 48 83 7b 30 00 75 08 48 c7 43 30 d0 e9 82 80 48 c7 c7 60 24 cc 80 e8 bd 5a 54 00 <48> 8b 45 60 48 89 6b 58 48 89 5d 60 48 89 43 50 fe 05 f5 25 a0
RIP  [<ffffffff802bfe55>] proc_register+0xe7/0x10f
 RSP <ffff810824c57e60>
CR2: 000000000000005f
---[ end trace 02c2d78def82877a ]---
Kernel panic - not syncing: Attempted to kill init!

it turns out some variables near end of bss are corrupted already.

in System.map we have
ffffffff80d40420 b rsi_table
ffffffff80d40620 B krb5_seq_lock
ffffffff80d40628 b i.20437
ffffffff80d40630 b xprt_rdma_inline_write_padding
ffffffff80d40638 b sunrpc_table_header
ffffffff80d40640 b zero
ffffffff80d40644 b min_memreg
ffffffff80d40648 b rpcrdma_tk_lock_g
ffffffff80d40650 B sctp_assocs_id_lock
ffffffff80d40658 B proc_net_sctp
ffffffff80d40660 B sctp_assocs_id
ffffffff80d40680 B sysctl_sctp_mem
ffffffff80d40690 B sysctl_sctp_rmem
ffffffff80d406a0 B sysctl_sctp_wmem
ffffffff80d406b0 b sctp_ctl_socket
ffffffff80d406b8 b sctp_pf_inet6_specific
ffffffff80d406c0 b sctp_pf_inet_specific
ffffffff80d406c8 b sctp_af_v4_specific
ffffffff80d406d0 b sctp_af_v6_specific
ffffffff80d406d8 b sctp_rand.33270
ffffffff80d406dc b sctp_memory_pressure
ffffffff80d406e0 b sctp_sockets_allocated
ffffffff80d406e4 b sctp_memory_allocated
ffffffff80d406e8 b sctp_sysctl_header
ffffffff80d406f0 b zero
ffffffff80d406f4 A __bss_stop
ffffffff80d406f4 A _end

and setup_node_bootmem() will use that page 0xd40000 for bootmap
Bootmem setup node 0 0000000000000000-0000000828000000
  NODE_DATA [000000000008a485 - 0000000000091484]
  bootmap [0000000000d406f4 -  0000000000e456f3] pages 105
Bootmem setup node 1 0000000828000000-0000001028000000
  NODE_DATA [0000000828000000 - 0000000828006fff]
  bootmap [0000000828007000 -  0000000828106fff] pages 100
Bootmem setup node 2 0000001028000000-0000001828000000
  NODE_DATA [0000001028000000 - 0000001028006fff]
  bootmap [0000001028007000 -  0000001028106fff] pages 100
Bootmem setup node 3 0000001828000000-0000002028000000
  NODE_DATA [0000001828000000 - 0000001828006fff]
  bootmap [0000001828007000 -  0000001828106fff] pages 100

setup_node_bootmem() makes NODE_DATA cacheline aligned,
and bootmap is page-aligned.

the patch updates find_e820_area() to make sure we can meet
the alignment constraints.

Signed-off-by: Yinghai Lu <yinghai.lu@sun.com>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
2008-02-01 17:49:41 +01:00
Yinghai Lu
25eff8d4cd x86_64: add debug name for early_res
helps debugging problems in this rather murky area of code.

Signed-off-by: Yinghai Lu <yinghai.lu@sun.com>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
2008-02-01 17:49:41 +01:00
Thomas Gleixner
cd689985cf futex: Add bitset conditional wait/wakeup functionality
To allow the implementation of optimized rw-locks in user space, glibc
needs a possibility to select waiters for wakeup depending on a bitset
mask.

This requires two new futex OPs: FUTEX_WAIT_BITS and FUTEX_WAKE_BITS
These OPs are basically the same as FUTEX_WAIT and FUTEX_WAKE plus an
additional argument - a bitset. Further the FUTEX_WAIT_BITS OP is
expecting an absolute timeout value instead of the relative one, which
is used for the FUTEX_WAIT OP.

FUTEX_WAIT_BITS calls into the kernel with a bitset. The bitset is
stored in the futex_q structure, which is used to enqueue the waiter
into the hashed futex waitqueue.

FUTEX_WAKE_BITS also calls into the kernel with a bitset. The wakeup
function logically ANDs the bitset with the bitset stored in each
waiters futex_q structure. If the result is zero (i.e. none of the set
bits in the bitsets is matching), then the waiter is not woken up. If
the result is not zero (i.e. one of the set bits in the bitsets is
matching), then the waiter is woken.

The bitset provided by the caller must be non zero. In case the
provided bitset is zero the kernel returns EINVAL.

Internaly the new OPs are only extensions to the existing FUTEX_WAIT
and FUTEX_WAKE functions. The existing OPs hand a bitset with all bits
set into the futex_wait() and futex_wake() functions.

Signed-off-by: Thomas Gleixner <tgxl@linutronix.de>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
2008-02-01 17:45:14 +01:00
Thomas Gleixner
9d55b9923a x86: replace LOCK_PREFIX in futex.h
The exception fixup for the futex macros __futex_atomic_op1/2 and
futex_atomic_cmpxchg_inatomic() is missing an entry when the lock
prefix is replaced by a NOP via SMP alternatives.

Chuck Ebert tracked this down from the information provided in:
https://bugzilla.redhat.com/show_bug.cgi?id=429412

A possible solution would be to add another fixup after the
LOCK_PREFIX, so both the LOCK and NOP case have their own entry in the
exception table, but it's not really worth the trouble.

Simply replace LOCK_PREFIX with lock and keep those untouched by SMP
alternatives.

Signed-off-by: Thomas Gleixner <tglx@linutronix.de>

Signed-off-by: Ingo Molnar <mingo@elte.hu>
2008-02-01 17:45:14 +01:00
Thomas Gleixner
5df7fa1c62 tick-sched: add more debug information
To allow better diagnosis of tick-sched related, especially NOHZ
related problems, we need to know when the last wakeup via an irq
happened and when the CPU left the idle state.

Add two fields (idle_waketime, idle_exittime) to the tick_sched
structure and add them to the timer_list output.

Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
2008-02-01 17:45:14 +01:00
Thomas Gleixner
1001d0a9ee timekeeping: update xtime_cache when time(zone) changes
xtime_cache needs to be updated whenever xtime and or wall_to_monotic
are changed. Otherwise users of xtime_cache might see a stale (and in
the case of timezone changes utterly wrong) value until the next
update happens.

Fixup the obvious places, which miss this update.

Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Acked-by: John Stultz <johnstul@us.ibm.com>
Tested-by: Dhaval Giani <dhaval@linux.vnet.ibm.com>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
2008-02-01 17:45:13 +01:00
Linus Torvalds
24e1c13c93 Merge branch 'for-linus' of git://git.kernel.dk/linux-2.6-block
* 'for-linus' of git://git.kernel.dk/linux-2.6-block:
  block: kill swap_io_context()
  as-iosched: fix inconsistent ioc->lock context
  ide-cd: fix leftover data BUG
  block: make elevator lib checkpatch compliant
  cfq-iosched: make checkpatch compliant
  block: make core bits checkpatch compliant
  block: new end request handling interface should take unsigned byte counts
  unexport add_disk_randomness
  block/sunvdc.c:print_version() must be __devinit
  splice: always updated atime in direct splice
2008-02-01 21:48:45 +11:00
Jens Axboe
3bc217ffe6 block: kill swap_io_context()
It blindly copies everything in the io_context, including the lock.
That doesn't work so well for either lock ordering or lockdep.

There seems zero point in swapping io contexts on a request to request
merge, so the best point of action is to just remove it.

Signed-off-by: Jens Axboe <jens.axboe@oracle.com>
2008-02-01 11:34:49 +01:00
Linus Torvalds
cec03afcb6 Merge git://git.kernel.org/pub/scm/linux/kernel/git/davem/net-2.6
* git://git.kernel.org/pub/scm/linux/kernel/git/davem/net-2.6: (173 commits)
  [NETNS]: Lookup in FIB semantic hashes taking into account the namespace.
  [NETNS]: Add a namespace mark to fib_info.
  [IPV4]: fib_sync_down rework.
  [NETNS]: Process interface address manipulation routines in the namespace.
  [IPV4]: Small style cleanup of the error path in rtm_to_ifaddr.
  [IPV4]: Fix memory leak on error path during FIB initialization.
  [NETFILTER]: Ipv6-related xt_hashlimit compilation fix.
  [NET_SCHED]: Add flow classifier
  [NET_SCHED]: sch_sfq: make internal queues visible as classes
  [NET_SCHED]: sch_sfq: add support for external classifiers
  [NET_SCHED]: Constify struct tcf_ext_map
  [BLUETOOTH]: Fix bugs in previous conn add/del workqueue changes.
  [TCP]: Unexport sysctl_tcp_tso_win_divisor
  [IPV4]: Make struct ipv4_devconf static.
  [TR] net/802/tr.c: sysctl_tr_rif_timeout static
  [XFRM]: Fix statistics.
  [XFRM]: Remove unused exports.
  [PKT_SCHED] sch_teql.c: Duplicate IFF_BROADCAST in FMASK, remove 2nd.
  [BNX2]: Fix ASYM PAUSE advertisement for remote PHY.
  [IPV4] route cache: Introduce rt_genid for smooth cache invalidation
  ...
2008-02-01 21:06:29 +11:00
Greg Ungerer
f6efaf62bb m68knommu: remove unused CONFIG_DISKtel symbol
Remove unused CONFIG_DISKtel define.

Signed-off-by: Greg Ungerer <gerg@uclinux.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2008-02-01 21:00:01 +11:00
Greg Ungerer
b7dcf7fe7c m68knommu: fix 528x ColdFire cache settings
Fix problems with the 528x ColdFire CPU cache setup.
Do not cache the flash region (if present), and make the runtime
settings consistent with the init setting.

Problems pointed out by Bernd Buttner <b.buettner@mkc-gmbh.de>

Signed-off-by: Greg Ungerer <gerg@uclinux.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2008-02-01 21:00:01 +11:00
Jens Axboe
22b132102f block: new end request handling interface should take unsigned byte counts
No point in passing signed integers as the byte count, they can never
be negative.

Signed-off-by: Jens Axboe <jens.axboe@oracle.com>
2008-02-01 09:26:33 +01:00
Denis V. Lunev
4814bdbd59 [NETNS]: Lookup in FIB semantic hashes taking into account the namespace.
The namespace is not available in the fib_sync_down_addr, add it as a
parameter.

Looking up a device by the pointer to it is OK. Looking up using a
result from fib_trie/fib_hash table lookup is also safe. No need to
fix that at all.  So, just fix lookup by address and insertion to the
hash table path.

Signed-off-by: Denis V. Lunev <den@openvz.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
2008-01-31 19:28:41 -08:00
Denis V. Lunev
7462bd744e [NETNS]: Add a namespace mark to fib_info.
This is required to make fib_info lookups namespace aware. In the
other case initial namespace devices are marked as dead in the local
routing table during other namespace stop.

Signed-off-by: Denis V. Lunev <den@openvz.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
2008-01-31 19:28:40 -08:00
Denis V. Lunev
85326fa54b [IPV4]: fib_sync_down rework.
fib_sync_down can be called with an address and with a device. In
reality it is called either with address OR with a device. The
codepath inside is completely different, so lets separate it into two
calls for these two cases.

Signed-off-by: Denis V. Lunev <den@openvz.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
2008-01-31 19:28:39 -08:00
Patrick McHardy
e5dfb81518 [NET_SCHED]: Add flow classifier
Add new "flow" classifier, which is meant to extend the SFQ hashing
capabilities without hard-coding new hash functions and also allows
deterministic mappings of keys to classes, replacing some out of tree
iptables patches like IPCLASSIFY (maps IPs to classes), IPMARK (maps
IPs to marks, with fw filters to classes), ...

Some examples:

- Classic SFQ hash:

  tc filter add ... flow hash \
  	keys src,dst,proto,proto-src,proto-dst divisor 1024

- Classic SFQ hash, but using information from conntrack to work properly in
  combination with NAT:

  tc filter add ... flow hash \
  	keys nfct-src,nfct-dst,proto,nfct-proto-src,nfct-proto-dst divisor 1024

- Map destination IPs of 192.168.0.0/24 to classids 1-257:

  tc filter add ... flow map \
  	key dst addend -192.168.0.0 divisor 256

- alternatively:

  tc filter add ... flow map \
  	key dst and 0xff

- similar, but reverse ordered:

  tc filter add ... flow map \
  	key dst and 0xff xor 0xff

Perturbation is currently not supported because we can't reliable kill the
timer on destruction.

Signed-off-by: Patrick McHardy <kaber@trash.net>
Signed-off-by: David S. Miller <davem@davemloft.net>
2008-01-31 19:28:36 -08:00
Patrick McHardy
94de78d195 [NET_SCHED]: sch_sfq: make internal queues visible as classes
Add support for dumping statistics and make internal queues visible as
classes.

Signed-off-by: Patrick McHardy <kaber@trash.net>
Signed-off-by: David S. Miller <davem@davemloft.net>
2008-01-31 19:28:35 -08:00
Patrick McHardy
5239008b0d [NET_SCHED]: Constify struct tcf_ext_map
Signed-off-by: Patrick McHardy <kaber@trash.net>
Signed-off-by: David S. Miller <davem@davemloft.net>
2008-01-31 19:28:34 -08:00
Adrian Bunk
0027ba8434 [IPV4]: Make struct ipv4_devconf static.
struct ipv4_devconf can now become static.

Signed-off-by: Adrian Bunk <bunk@kernel.org>
Acked-by: Pavel Emelyanov <xemul@openvz.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
2008-01-31 19:28:31 -08:00
Masahide NAKAMURA
9472c9ef64 [XFRM]: Fix statistics.
o Outbound sequence number overflow error status
  is counted as XfrmOutStateSeqError.
o Additionaly, it changes inbound sequence number replay
  error name from XfrmInSeqOutOfWindow to XfrmInStateSeqError
  to apply name scheme above.
o Inbound IPv4 UDP encapsuling type mismatch error is wrongly
  mapped to XfrmInStateInvalid then this patch fiex the error
  to XfrmInStateMismatch.

Signed-off-by: Masahide NAKAMURA <nakam@linux-ipv6.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
2008-01-31 19:28:30 -08:00
Eric Dumazet
29e75252da [IPV4] route cache: Introduce rt_genid for smooth cache invalidation
Current ip route cache implementation is not suited to large caches.

We can consume a lot of CPU when cache must be invalidated, since we
currently need to evict all cache entries, and this eviction is
sometimes asynchronous. min_delay & max_delay can somewhat control this
asynchronism behavior, but whole thing is a kludge, regularly triggering
infamous soft lockup messages. When entries are still in use, this also
consumes a lot of ram, filling dst_garbage.list.

A better scheme is to use a generation identifier on each entry,
so that cache invalidation can be performed by changing the table
identifier, without having to scan all entries.
No more delayed flushing, no more stalling when secret_interval expires.

Invalidated entries will then be freed at GC time (controled by
ip_rt_gc_timeout or stress), or when an invalidated entry is found
in a chain when an insert is done.
Thus we keep a normal equilibrium.

This patch :
- renames rt_hash_rnd to rt_genid (and makes it an atomic_t)
- Adds a new rt_genid field to 'struct rtable' (filling a hole on 64bit)
- Checks entry->rt_genid at appropriate places :
2008-01-31 19:28:27 -08:00
Chris Leech
e83a2ea850 [VLAN]: set_rx_mode support for unicast address list
Reuse the existing logic for multicast list synchronization for the
unicast address list. The core of dev_mc_sync/unsync are split out as
__dev_addr_sync/unsync and moved from dev_mcast.c to dev.c.  These are
then used to implement dev_unicast_sync/unsync as well.

I'm working on cleaning up Intel's FCoE stack, which generates new MAC
addresses from the fibre channel device id assigned by the fabric as
per the current draft specification in T11.  When using such a
protocol in a VLAN environment it would be nice to not always be
forced into promiscuous mode, assuming the underlying Ethernet driver
supports multiple unicast addresses as well.

Signed-off-by: Chris Leech <christopher.leech@intel.com>
Signed-off-by: Patrick McHardy <kaber@trash.net>
2008-01-31 19:28:24 -08:00
Stephen Hemminger
71d67e666e [IPV4] fib_trie: rescan if key is lost during dump
Normally during a dump the key of the last dumped entry is used for
continuation, but since lock is dropped it might be lost. In that case
fallback to the old counter based N^2 behaviour.  This means the dump
will end up skipping some routes which matches what FIB_HASH does.

Signed-off-by: Stephen Hemminger <shemminger@vyatta.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2008-01-31 19:28:23 -08:00
Pavel Emelyanov
d86e0dac2c [NETNS]: Tcp-v6 sockets per-net lookup.
Add a net argument to inet6_lookup and propagate it further.
Actually, this is tcp-v6 implementation of what was done for
tcp-v4 sockets in a previous patch.

Signed-off-by: Pavel Emelyanov <xemul@openvz.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
2008-01-31 19:28:20 -08:00