This patch improves ctnetlink event reliability if one broadcast
listener has set the NETLINK_BROADCAST_ERROR socket option.
The logic is the following: if an event delivery fails, we keep
the undelivered events in the missed event cache. Once the next
packet arrives, we add the new events (if any) to the missed
events in the cache and we try a new delivery, and so on. Thus,
if ctnetlink fails to deliver an event, we try to deliver them
once we see a new packet. Therefore, we may lose state
transitions but the userspace process gets in sync at some point.
At worst case, if no events were delivered to userspace, we make
sure that destroy events are successfully delivered. Basically,
if ctnetlink fails to deliver the destroy event, we remove the
conntrack entry from the hashes and we insert them in the dying
list, which contains inactive entries. Then, the conntrack timer
is added with an extra grace timeout of random32() % 15 seconds
to trigger the event again (this grace timeout is tunable via
/proc). The use of a limited random timeout value allows
distributing the "destroy" resends, thus, avoiding accumulating
lots "destroy" events at the same time. Event delivery may
re-order but we can identify them by means of the tuple plus
the conntrack ID.
The maximum number of conntrack entries (active or inactive) is
still handled by nf_conntrack_max. Thus, we may start dropping
packets at some point if we accumulate a lot of inactive conntrack
entries that did not successfully report the destroy event to
userspace.
During my stress tests consisting of setting a very small buffer
of 2048 bytes for conntrackd and the NETLINK_BROADCAST_ERROR socket
flag, and generating lots of very small connections, I noticed
very few destroy entries on the fly waiting to be resend.
A simple way to test this patch consist of creating a lot of
entries, set a very small Netlink buffer in conntrackd (+ a patch
which is not in the git tree to set the BROADCAST_ERROR flag)
and invoke `conntrack -F'.
For expectations, no changes are introduced in this patch.
Currently, event delivery is only done for new expectations (no
events from expectation expiration, removal and confirmation).
In that case, they need a per-expectation event cache to implement
the same idea that is exposed in this patch.
This patch can be useful to provide reliable flow-accouting. We
still have to add a new conntrack extension to store the creation
and destroy time.
Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
Signed-off-by: Patrick McHardy <kaber@trash.net>
This patch adds the hlist_nulls_add_head() function which is
based on hlist_nulls_add_head_rcu() but without the use of
rcu_assign_pointer(). It also adds hlist_nulls_del which is
exactly the same like hlist_nulls_del_rcu().
Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
Acked-by: Eric Dumazet <eric.dumazet@gmail.com>
Signed-off-by: Patrick McHardy <kaber@trash.net>
This patch moves the helper destruction to a function that lives
in nf_conntrack_helper.c. This new function is used in the patch
to add ctnetlink reliable event delivery.
Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
Signed-off-by: Patrick McHardy <kaber@trash.net>
This patch reworks the per-cpu event caching to use the conntrack
extension infrastructure.
The main drawback is that we consume more memory per conntrack
if event delivery is enabled. This patch is required by the
reliable event delivery that follows to this patch.
BTW, this patch allows you to enable/disable event delivery via
/proc/sys/net/netfilter/nf_conntrack_events in runtime, although
you can still disable event caching as compilation option.
Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
Signed-off-by: Patrick McHardy <kaber@trash.net>
Use mod_timer_pending() instead of atomic sequence of del_timer()/
add_timer(). mod_timer_pending() does not rearm an inactive timer,
so we don't need the conntrack lock anymore to make sure we don't
accidentally rearm a timer of a conntrack which is in the process
of being destroyed.
With this change, we don't need to take the global lock anymore at all,
counter updates can be performed under the per-conntrack lock.
Signed-off-by: Patrick McHardy <kaber@trash.net>
commit 3f68535ada (clocksource: sanity check sysfs clocksource
changes) prevents selection of non high resolution capable
clocksources when high resolution mode is active, but did not take
into account that the same rules apply for highres=off nohz=on.
Check the tick device mode instead of hrtimer_hres_active() to verify
whether the system needs to be protected from a switch to jiffies or
other non highres capable clock sources.
Reported-by: Luming Yu <luming.yu@gmail.com>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
There is sometimes a need for the ocores driver to add devices to the
bus when installed.
i2c_register_board_info can not always be used, because the I2C devices
are not known at an early state, they could for instance be connected
on a I2C bus on a PCI device which has the Open Cores IP.
i2c_new_device can not be used in all cases either since the resulting
bus nummer might be unknown.
The solution is the pass a list of I2C devices in the platform data to
the Open Cores driver. This is useful for MFD drivers.
Signed-off-by: Richard Röjfors <richard.rojfors.ext@mocean-labs.com>
Signed-off-by: Ben Dooks <ben-linux@fluff.org>
Change to using platform id table to match either of the two supported
platform device names in the driver. This simplifies the driver init and
exit code
Note, log messages will now be prefixed with 's3c-i2c' instead of the
driver name, so output will be of the form of:
s3c-i2c s3c2440-i2c.0: slave address 0x10
Signed-off-by: Ben Dooks <ben-linux@fluff.org>
Use longer noise filter period for fast and standard mode. Based on an
earlier patch by Eero Nurkkala.
Signed-off-by: Aaro Koskinen <aaro.koskinen@nokia.com>
Acked-by: Tony Lindgren <tony@atomide.com>
Signed-off-by: Ben Dooks <ben-linux@fluff.org>
Fix scll/sclh calculations for HS and fast modes. Currently the driver
uses equal (roughly) low/high times which will result in too short
low time.
OMAP3430 TRM gives the following equations:
F/S: tLow = (scll + 7) * internal_clk
tHigh = (sclh + 5) * internal_clk
HS: tLow = (scll + 7) * fclk
tHigh = (sclh + 5) * fclk
Furthermore, the I2C specification sets the following minimum values
for HS tLow/tHigh for capacitive bus loads 100 pF (maximum speed 3400)
and 400 pF (maximum speed 1700):
speed tLow tHigh
3400 160 ns 60 ns
1700 320 ns 120 ns
and for F/S:
speed tLow tHigh
400 1300 ns 600 ns
100 4700 ns 4000 ns
By using duty cycles 33/66 (HS, F) and 50/50 (S) we stay above these
minimum values.
Signed-off-by: Aaro Koskinen <aaro.koskinen@nokia.com>
Acked-by: Tony Lindgren <tony@atomide.com>
Signed-off-by: Ben Dooks <ben-linux@fluff.org>
Some drivers need i2c_smbus_read_i2c_block_data() functionality, so add
support for it to the Blackfin I2C bus driver.
Signed-off-by: Michael Hennerich <michael.hennerich@analog.com>
Signed-off-by: Mike Frysinger <vapier@gentoo.org>
[ben-linux@fluff.org: shortened subject]
Signed-off-by: Ben Dooks <ben-linux@fluff.org>
We have a custom BF537 board with an I2C RTC (MAX DS3231) running
uclinux 2007R1 for some time. Recently during migration to 2008R1.5-RC3
we losted access to the RTC. The RTC driver calls 'i2c_transfer()' which
in turns calls 'bfin_twi_master_xfer()' in i2c-bfin-twi.c.
Compared with 2007R1, it looks like the 2008R1.5 version of i2c-bin-twi.c
has a new mode 'TWI_I2C-MODE_REPEAT' which corresponds to the Repeat Start
Condition described in the HRM. However, according to the HRM, at XMIT or
RECV interrupt and when the data count is 0, not only is the RESTART bit
supposed to be set, but MDIR must also be set if the next operation is a
receive sequence, and cleared if not. Currently there is no code that looks
at the I2C_M_RD bit in the flag from the next cur_msg and set/clear the MDIR
flag accordingly at the same time that the RSTART bit is set. Instead, MDIR
is set or cleared (by OR'ing with 0?) after the RESTART bit has been cleared
during handling of MCOMP interrupt.
It appears that this is causing our failure with reading the RTC, as a
quick patch to set/clear MDIR when RESTART is set seem to solve our problem.
Signed-off-by: Frank Shew <fshew@geometrics.com>
Signed-off-by: Sonic Zhang <sonic.zhang@analog.com>
Signed-off-by: Mike Frysinger <vapier@gentoo.org>
Signed-off-by: Bryan Wu <cooloney@kernel.org>
[ben-linux@fluff.org: shorted subject]
Signed-off-by: Ben Dooks <ben-linux@fluff.org>
Avoid rewrite TWI MASTER_CTL reg when issue next message
In i2c repeat transfer mode, byte count of next message should be filled
into part of the TWI MASTER_CTL reg when interrupt MCOMP of last
message transfer is triggered. But, other bits in this reg should
not be touched.
Signed-off-by: Sonic Zhang <sonic.zhang@analog.com>
Signed-off-by: Mike Frysinger <vapier@gentoo.org>
Signed-off-by: Bryan Wu <cooloney@kernel.org>
[ben-linux@fluff.org: shorted subject]
Signed-off-by: Ben Dooks <ben-linux@fluff.org>
Make sure we don't end up with an invalid CLKDIV=0 in case someone
specifies 20kHz SCL or less (5 * 1024 / 20 = 0x100).
Signed-off-by: Michael Hennerich <michael.hennerich@analog.com>
Signed-off-by: Mike Frysinger <vapier@gentoo.org>
Signed-off-by: Bryan Wu <cooloney@kernel.org>
[ben-linux@fluff.org: shortened subject line]
Signed-off-by: Ben Dooks <ben-linux@fluff.org>
Convert magic values 1 and -1 to NETDEV_TX_BUSY and NETDEV_TX_LOCKED respectively.
0 (NETDEV_TX_OK) is not changed to keep the noise down, except in very few cases
where its in direct proximity to one of the other values.
Signed-off-by: Patrick McHardy <kaber@trash.net>
Signed-off-by: David S. Miller <davem@davemloft.net>
Fix up USB drivers that return an errno value (result of usb_submit_urb())
to qdisc_restart(), causing qdisc_restart() to print a warning and requeue/
retransmit the skb.
- hso: skb is freed: use after free
- at76_usb: skb is freed: use after free
Compile tested only.
Signed-off-by: Patrick McHardy <kaber@trash.net>
Signed-off-by: David S. Miller <davem@davemloft.net>
Fix up ATM drivers that return an errno value to qdisc_restart(), causing
qdisc_restart() to print a warning an requeue/retransmit the skb.
- lec: condition can only be remedied by userspace, until that retransmissions
Compile tested only.
Signed-off-by: Patrick McHardy <kaber@trash.net>
Signed-off-by: David S. Miller <davem@davemloft.net>
Fix up hamradio drivers that return an errno value to dev_queue_xmit(), causing
it to print a warning an free the skb.
- bpqether: skb is freed: use after free
Compile tested only.
Signed-off-by: Patrick McHardy <kaber@trash.net>
Signed-off-by: David S. Miller <davem@davemloft.net>
Fix up s390 drivers that return an errno value to qdisc_restart(), causing
qdisc_restart() to print a warning an requeue/retransmit the skb.
- claw: impossible condition, simply remove it
Signed-off-by: Patrick McHardy <kaber@trash.net>
Signed-off-by: David S. Miller <davem@davemloft.net>
Fix up WAN drivers that return an errno value to qdisc_restart(), causing
qdisc_restart() to print a warning an requeue/retransmit the skb.
- cycx_x25: intention appears to be to requeue the skb
Does not compile cleanly for me even without this patch, so untested.
Signed-off-by: Patrick McHardy <kaber@trash.net>
Signed-off-by: David S. Miller <davem@davemloft.net>
net: fix network drivers ndo_start_xmit() return values (part 3)
Fix up wireless drivers that return an errno value to qdisc_restart(), causing
qdisc_restart() to print a warning an requeue/retransmit the skb.
- airo: transmission not implemented for chip, intention is to free and abort
- ipw2200: transmission not implemented for promiscous mode, intention is to
drop
- prism54: intention is to drop
- wl3501_cs: intention appears to be to drop
- zd1201: error counter indicates intention is to drop
All drivers compile tested.
Signed-off-by: Patrick McHardy <kaber@trash.net>
Signed-off-by: David S. Miller <davem@davemloft.net>
Fix up IRDA drivers that return an errno value to qdisc_restart(), causing
qdisc_restart() to print a warning an requeue/retransmit the skb.
- donauboe: intention appears to be to have the skb retransmitted without
error message
- irda-usb: intention is to drop silently according to comment
- kingsub-sir: skb is freed: use after free
- ks959-sir: skb is freed: use after free
- ksdazzle-sir: skb is freed: use after free
- mcs7880: skb is freed: use after free
All but donauboe compile tested.
Signed-off-by: Patrick McHardy <kaber@trash.net>
Signed-off-by: David S. Miller <davem@davemloft.net>
Fix up drivers that return an errno value to qdisc_restart(), causing
qdisc_restart() to print a warning and requeue/retransmit the skb.
- xpnet: memory allocation error, intention is to drop
- ethoc: oversized packet, packet must be dropped
- ibmlana: skb freed: use after free
- rrunner: skb freed: use after free
Signed-off-by: Patrick McHardy <kaber@trash.net>
Signed-off-by: David S. Miller <davem@davemloft.net>
Identation got messed up when merging the current_umask changes with
the generic ACL support.
Signed-off-by: Christoph Hellwig <hch@lst.de>
Reviewed-by: Felix Blyakher <felixb@sgi.com>
Signed-off-by: Felix Blyakher <felixb@sgi.com>
Fix warnings about unitialized dquot variables by making sure
xfs_qm_vop_dqalloc touches it even when quotas are disabled.
Signed-off-by: Christoph Hellwig <hch@lst.de>
Reviewed-by: Felix Blyakher <felixb@sgi.com>
Signed-off-by: Felix Blyakher <felixb@sgi.com>
* 'upstream-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/jlbec/configfs:
configfs: Rework configfs_depend_item() locking and make lockdep happy
configfs: Silence lockdep on mkdir() and rmdir()
* 'for-linus' of git://git390.marist.edu/pub/scm/linux-2.6: (30 commits)
[S390] wire up sys_perf_counter_open
[S390] wire up sys_rt_tgsigqueueinfo
[S390] ftrace: add system call tracer support
[S390] ftrace: add function graph tracer support
[S390] ftrace: add function trace mcount test support
[S390] ftrace: add dynamic ftrace support
[S390] kprobes: use probe_kernel_write
[S390] maccess: arch specific probe_kernel_write() implementation
[S390] maccess: add weak attribute to probe_kernel_write
[S390] profile_tick called twice
[S390] dasd: forward internal errors to dasd_sleep_on caller
[S390] dasd: sync after async probe
[S390] dasd: check_characteristics cleanup
[S390] dasd: no High Performance FICON in 31-bit mode
[S390] dcssblk: revert devt conversion
[S390] qdio: fix access beyond ARRAY_SIZE of irq_ptr->{in,out}put_qs
[S390] vmalloc: add vmalloc kernel parameter support
[S390] uaccess: use might_fault() instead of might_sleep()
[S390] 3270: lock dependency fixes
[S390] 3270: do not register with tty_register_device
...
* 'drm-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/airlied/drm-2.6: (50 commits)
drm: include kernel list header file in hashtab header
drm: Export hash table functionality.
drm: Split out the mm declarations in a separate header. Add atomic operations.
drm/radeon: add support for RV790.
drm/radeon: add rv740 drm support.
drm_calloc_large: check right size, check integer overflow, use GFP_ZERO
drm: Eliminate magic I2C frobbing when reading EDID
drm/i915: duplicate desired mode for use by fbcon.
drm/via: vfree() no need checking before calling it
drm: Replace DRM_DEBUG with DRM_DEBUG_DRIVER in i915 driver
drm: Replace DRM_DEBUG with DRM_DEBUG_MODE in drm_mode
drm/i915: Replace DRM_DEBUG with DRM_DEBUG_KMS in intel_sdvo
drm/i915: replace DRM_DEBUG with DRM_DEBUG_KMS in intel_lvds
drm: add separate drm debugging levels
radeon: remove _DRM_DRIVER from the preadded sarea map
drm: don't associate _DRM_DRIVER maps with a master
drm: simplify kcalloc() call to kzalloc().
intelfb: fix spelling of "CLOCK"
drm: fix LOCK_TEST_WITH_RETURN macro
drm/i915: Hook connector to encoder during load detection (fixes tv/vga detect)
...
We now have an IrDA git tree on kernel.org:
git://git.kernel.org/pub/scm/linux/kernel/git/sameo/irda-2.6.git
Signed-off-by: Samuel Ortiz <samuel@sortiz.org>
The sir retries count reaches -1 rather than 0.
Signed-off-by: Roel Kluin <roel.kluin@gmail.com>
Signed-off-by: Samuel Ortiz <samuel@sortiz.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Graff Yang <graff.yang@gmail.com>
Cc: Mike Frysinger <vapier@gentoo.org>
Cc: Bryan Wu <bryan.wu@analog.com>
Signed-off-by: Samuel Ortiz <samuel@sortiz.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
* 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/teigland/dlm:
dlm: use more NOFS allocation
dlm: connect to nodes earlier
dlm: fix use count with multiple joins
dlm: Make name input parameter of {,dlm_}new_lockspace() const
* 'perfcounters-fixes-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip:
perf_counter: Start documenting HAVE_PERF_COUNTERS requirements
perf_counter: Add forward/backward attribute ABI compatibility
perf record: Explicity program a default counter
perf_counter: Remove PERF_TYPE_RAW special casing
perf_counter: PERF_TYPE_HW_CACHE is a hardware counter too
powerpc, perf_counter: Fix performance counter event types
perf_counter/x86: Add a quirk for Atom processors
perf_counter tools: Remove one L1-data alias
* 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/vapier/blackfin: (62 commits)
Blackfin: fix sparseirq/kstat_irqs fallout
Blackfin: fix unused warnings after nommu update
Blackfin: export the last exception cause via debugfs
Blackfin: fix length checking in kgdb_ebin2mem
Blackfin: kgdb: fix up error return values
Blackfin: push access_ok() L1 attribute down
Blackfin: punt duplicated search_exception_table() prototype
Blackfin: add missing access_ok() checks to user functions
Blackfin: convert early_printk EVT init to a loop
Blackfin: document the lsl variants of the L1 allocator
Blackfin: rename Blackfin relocs according to the toolchain
Blackfin: check SIC defines rather than variant names
Blackfin: add SSYNC to set_dma_sg() for descriptor fetching
Blackfin: convert SMP to only use generic time framework
Blackfin: bf548-ezkit/bf537-stamp: add resources for ADXL345/346
Blackfin: override default uClinux MTD addr/size
Blackfin: fix command line corruption with DEBUG_DOUBLEFAULT
Blackfin: fix handling of initial L1 reservation
Blackfin: merge sram init functions
Blackfin: drop unused reserve_pda() function
...
* 'for-linus' of git://linux-arm.org/linux-2.6:
kmemleak: Add more info to the MAINTAINERS entry
kmemleak: Remove the kmemleak.h include in drivers/char/vt.c
git commit 0a0c5168 "PM: Introduce functions for suspending and resuming
device interrupts" introduced some helper functions. However these
functions are only available for architectures which support
GENERIC_HARDIRQS.
Other architectures will see this build error:
drivers/built-in.o: In function `sysdev_suspend':
(.text+0x15138): undefined reference to `check_wakeup_irqs'
drivers/built-in.o: In function `device_power_up':
(.text+0x1cb66): undefined reference to `resume_device_irqs'
drivers/built-in.o: In function `device_power_down':
(.text+0x1cb92): undefined reference to `suspend_device_irqs'
To fix this add some empty inline functions for !GENERIC_HARDIRQS.
Signed-off-by: Heiko Carstens <heiko.carstens@de.ibm.com>
Signed-off-by: Rafael J. Wysocki <rjw@sisk.pl>
The *_nvs_* routines in swsusp.c make use of the io*map()
functions, which are only provided for HAS_IOMEM, thus
breaking compilation if HAS_IOMEM is not set. Fix this
by moving the *_nvs_* routines into hibernate_nvs.c, which
is only compiled if HAS_IOMEM is set.
[rjw: Change the name of the new file to hibernate_nvs.c, add the
license line to the header comment.]
Signed-off-by: Cornelia Huck <cornelia.huck@de.ibm.com>
Acked-by: Pavel Machek <pavel@ucw.cz>
Signed-off-by: Rafael J. Wysocki <rjw@sisk.pl>
Change the name of kernel/power/disk.c to kernel/power/hibernate.c
in analogy with the file names introduced by the changes that
separated the suspend to RAM and standby funtionality from the
common PM functions.
Signed-off-by: Rafael J. Wysocki <rjw@sisk.pl>
Acked-by: Pavel Machek <pavel@ucw.cz>
Move the suspend to RAM and standby code from kernel/power/main.c
to two separate files, kernel/power/suspend.c containing the basic
functions and kernel/power/suspend_test.c containing the automatic
suspend test facility based on the RTC clock alarm.
There are no changes in functionality related to these modifications.
Signed-off-by: Rafael J. Wysocki <rjw@sisk.pl>
Acked-by: Pavel Machek <pavel@ucw.cz>
This patch reworks the platform driver code for legacy
suspend and resume to avoid installing callbacks in
struct device_driver. A warning is also added telling
users to update the platform driver to use dev_pm_ops.
The functions platform_legacy_suspend()/resume() directly
call suspend and resume callbacks in struct platform_driver
instead of wrapping things in platform_drv_suspend()/resume().
Signed-off-by: Magnus Damm <damm@igel.co.jp>
Acked-by: Greg Kroah-Hartman <gregkh@suse.de>
Signed-off-by: Rafael J. Wysocki <rjw@sisk.pl>
This patch removes the legacy callbacks ->suspend() and
->resume() from struct device_type. These callbacks seem
unused, and new code should instead make use of struct
dev_pm_ops.
Signed-off-by: Magnus Damm <damm@igel.co.jp>
Acked-by: Greg Kroah-Hartman <gregkh@suse.de>
Signed-off-by: Rafael J. Wysocki <rjw@sisk.pl>
A future patch is going to modify the memory shrinking code so that
it will make memory allocations to free memory instead of using an
artificial memory shrinking mechanism for that. For this purpose it
is convenient to move swsusp_shrink_memory() from
kernel/power/swsusp.c to kernel/power/snapshot.c, because the new
memory-shrinking code is going to use things that are local to
kernel/power/snapshot.c .
[rev. 2: Make some functions static and remove their headers from
kernel/power/power.h]
Signed-off-by: Rafael J. Wysocki <rjw@sisk.pl>
Acked-by: Pavel Machek <pavel@ucw.cz>
Acked-by: Wu Fengguang <fengguang.wu@intel.com>
Remove the shrinking of memory from the suspend-to-RAM code, where
it is not really necessary.
Signed-off-by: Rafael J. Wysocki <rjw@sisk.pl>
Acked-by: Nigel Cunningham <nigel@tuxonice.net>
Acked-by: Wu Fengguang <fengguang.wu@intel.com>