kernel_optimize_test

Author	SHA1	Message	Date
Vasiliy Kulikov	8909c9ad8f	net: don't allow CAP_NET_ADMIN to load non-netdev kernel modules Since `a8f80e8ff9` any process with CAP_NET_ADMIN may load any module from /lib/modules/. This doesn't mean that CAP_NET_ADMIN is a superset of CAP_SYS_MODULE as modules are limited to /lib/modules/**. However, CAP_NET_ADMIN capability shouldn't allow anybody load any module not related to networking. This patch restricts an ability of autoloading modules to netdev modules with explicit aliases. This fixes CVE-2011-1019. Arnd Bergmann suggested to leave untouched the old pre-v2.6.32 behavior of loading netdev modules by name (without any prefix) for processes with CAP_SYS_MODULE to maintain the compatibility with network scripts that use autoloading netdev modules by aliases like "eth0", "wlan0". Currently there are only three users of the feature in the upstream kernel: ipip, ip_gre and sit. root@albatros:~# capsh --drop=$(seq -s, 0 11),$(seq -s, 13 34) -- root@albatros:~# grep Cap /proc/$$/status CapInh: 0000000000000000 CapPrm: fffffff800001000 CapEff: fffffff800001000 CapBnd: fffffff800001000 root@albatros:~# modprobe xfs FATAL: Error inserting xfs (/lib/modules/2.6.38-rc6-00001-g2bf4ca3/kernel/fs/xfs/xfs.ko): Operation not permitted root@albatros:~# lsmod \| grep xfs root@albatros:~# ifconfig xfs xfs: error fetching interface information: Device not found root@albatros:~# lsmod \| grep xfs root@albatros:~# lsmod \| grep sit root@albatros:~# ifconfig sit sit: error fetching interface information: Device not found root@albatros:~# lsmod \| grep sit root@albatros:~# ifconfig sit0 sit0 Link encap:IPv6-in-IPv4 NOARP MTU:1480 Metric:1 root@albatros:~# lsmod \| grep sit sit 10457 0 tunnel4 2957 1 sit For CAP_SYS_MODULE module loading is still relaxed: root@albatros:~# grep Cap /proc/$$/status CapInh: 0000000000000000 CapPrm: ffffffffffffffff CapEff: ffffffffffffffff CapBnd: ffffffffffffffff root@albatros:~# ifconfig xfs xfs: error fetching interface information: Device not found root@albatros:~# lsmod \| grep xfs xfs 745319 0 Reference: https://lkml.org/lkml/2011/2/24/203 Signed-off-by: Vasiliy Kulikov <segoon@openwall.com> Signed-off-by: Michael Tokarev <mjt@tls.msk.ru> Acked-by: David S. Miller <davem@davemloft.net> Acked-by: Kees Cook <kees.cook@canonical.com> Signed-off-by: James Morris <jmorris@namei.org>	2011-03-10 10:25:19 +11:00
Linus Torvalds	fb62c00a6d	Merge branch 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/sage/ceph-client * 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/sage/ceph-client: ceph: no .snap inside of snapped namespace libceph: fix msgr standby handling libceph: fix msgr keepalive flag libceph: fix msgr backoff libceph: retry after authorization failure libceph: fix handling of short returns from get_user_pages ceph: do not clear I_COMPLETE from d_release ceph: do not set I_COMPLETE Revert "ceph: keep reference to parent inode on ceph_dentry"	2011-03-05 10:43:22 -08:00
Sage Weil	e00de341fd	libceph: fix msgr standby handling The standby logic used to be pretty dependent on the work requeueing behavior that changed when we switched to WQ_NON_REENTRANT. It was also very fragile. Restructure things so that: - We clear WRITE_PENDING when we set STANDBY. This ensures we will requeue work when we wake up later. - con_work backs off if STANDBY is set. There is nothing to do if we are in standby. - clear_standby() helper is called by both con_send() and con_keepalive(), the two actions that can wake us up again. Move the connect_seq++ logic here. Signed-off-by: Sage Weil <sage@newdream.net>	2011-03-04 12:25:05 -08:00
Sage Weil	e76661d0a5	libceph: fix msgr keepalive flag There was some broken keepalive code using a dead variable. Shift to using the proper bit flag. Signed-off-by: Sage Weil <sage@newdream.net>	2011-03-04 12:24:31 -08:00
Sage Weil	60bf8bf881	libceph: fix msgr backoff With commit f363e45f we replaced a bunch of hacky workqueue mutual exclusion logic with the WQ_NON_REENTRANT flag. One pieces of fallout is that the exponential backoff breaks in certain cases: * con_work attempts to connect. * we get an immediate failure, and the socket state change handler queues immediate work. * con_work calls con_fault, we decide to back off, but can't queue delayed work. In this case, we add a BACKOFF bit to make con_work reschedule delayed work next time it runs (which should be immediately). Signed-off-by: Sage Weil <sage@newdream.net>	2011-03-04 12:24:28 -08:00
Linus Torvalds	b65a0e0c84	Merge branch 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/jmorris/security-testing-2.6 * 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/jmorris/security-testing-2.6: DNS: Fix a NULL pointer deref when trying to read an error key [CVE-2011-1076]	2011-03-03 15:48:01 -08:00
Linus Torvalds	4438a02fc4	Merge git://git.kernel.org/pub/scm/linux/kernel/git/davem/net-2.6 * git://git.kernel.org/pub/scm/linux/kernel/git/davem/net-2.6: (42 commits) MAINTAINERS: Add Andy Gospodarek as co-maintainer. r8169: disable ASPM RxRPC: Fix v1 keys AF_RXRPC: Handle receiving ACKALL packets cnic: Fix lost interrupt on bnx2x cnic: Prevent status block race conditions with hardware net: dcbnl: check correct ops in dcbnl_ieee_set() e1000e: disable broken PHY wakeup for ICH10 LOMs, use MAC wakeup instead igb: fix sparse warning e1000: fix sparse warning netfilter: nf_log: avoid oops in (un)bind with invalid nfproto values dccp: fix oops on Reset after close ipvs: fix dst_lock locking on dest update davinci_emac: Add Carrier Link OK check in Davinci RX Handler bnx2x: update driver version to 1.62.00-6 bnx2x: properly calculate lro_mss bnx2x: perform statistics "action" before state transition. bnx2x: properly configure coefficients for MinBW algorithm (NPAR mode). bnx2x: Fix ethtool -t link test for MF (non-pmf) devices. bnx2x: Fix nvram test for single port devices. ...	2011-03-03 15:43:15 -08:00
David Howells	1362fa078d	DNS: Fix a NULL pointer deref when trying to read an error key [CVE-2011-1076] When a DNS resolver key is instantiated with an error indication, attempts to read that key will result in an oops because user_read() is expecting there to be a payload - and there isn't one [CVE-2011-1076]. Give the DNS resolver key its own read handler that returns the error cached in key->type_data.x[0] as an error rather than crashing. Also make the kenter() at the beginning of dns_resolver_instantiate() limit the amount of data it prints, since the data is not necessarily NUL-terminated. The buggy code was added in: commit `4a2d789267` Author: Wang Lei <wang840925@gmail.com> Date: Wed Aug 11 09:37:58 2010 +0100 Subject: DNS: If the DNS server returns an error, allow that to be cached [ver #2] This can trivially be reproduced by any user with the following program compiled with -lkeyutils: #include <stdlib.h> #include <keyutils.h> #include <err.h> static char payload[] = "#dnserror=6"; int main() { key_serial_t key; key = add_key("dns_resolver", "a", payload, sizeof(payload), KEY_SPEC_SESSION_KEYRING); if (key == -1) err(1, "add_key"); if (keyctl_read(key, NULL, 0) == -1) err(1, "read_key"); return 0; } What should happen is that keyctl_read() reports error 6 (ENXIO) to the user: dns-break: read_key: No such device or address but instead the kernel oopses. This cannot be reproduced with the 'keyutils add' or 'keyutils padd' commands as both of those cut the data down below the NUL termination that must be included in the data. Without this dns_resolver_instantiate() will return -EINVAL and the key will not be instantiated such that it can be read. The oops looks like: BUG: unable to handle kernel NULL pointer dereference at 0000000000000010 IP: [<ffffffff811b99f7>] user_read+0x4f/0x8f PGD 3bdf8067 PUD 385b9067 PMD 0 Oops: 0000 [#1] SMP last sysfs file: /sys/devices/pci0000:00/0000:00:19.0/irq CPU 0 Modules linked in: Pid: 2150, comm: dns-break Not tainted 2.6.38-rc7-cachefs+ #468 /DG965RY RIP: 0010:[<ffffffff811b99f7>] [<ffffffff811b99f7>] user_read+0x4f/0x8f RSP: 0018:ffff88003bf47f08 EFLAGS: 00010246 RAX: 0000000000000001 RBX: ffff88003b5ea378 RCX: ffffffff81972368 RDX: 0000000000000000 RSI: 0000000000000000 RDI: ffff88003b5ea378 RBP: ffff88003bf47f28 R08: ffff88003be56620 R09: 0000000000000000 R10: 0000000000000395 R11: 0000000000000002 R12: 0000000000000000 R13: 0000000000000000 R14: 0000000000000000 R15: ffffffffffffffa1 FS: 00007feab5751700(0000) GS:ffff88003e000000(0000) knlGS:0000000000000000 CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033 CR2: 0000000000000010 CR3: 000000003de40000 CR4: 00000000000006f0 DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000 DR3: 0000000000000000 DR6: 00000000ffff0ff0 DR7: 0000000000000400 Process dns-break (pid: 2150, threadinfo ffff88003bf46000, task ffff88003be56090) Stack: ffff88003b5ea378 ffff88003b5ea3a0 0000000000000000 0000000000000000 ffff88003bf47f68 ffffffff811b708e ffff88003c442bc8 0000000000000000 00000000004005a0 00007fffba368060 0000000000000000 0000000000000000 Call Trace: [<ffffffff811b708e>] keyctl_read_key+0xac/0xcf [<ffffffff811b7c07>] sys_keyctl+0x75/0xb6 [<ffffffff81001f7b>] system_call_fastpath+0x16/0x1b Code: 75 1f 48 83 7b 28 00 75 18 c6 05 58 2b fb 00 01 be bb 00 00 00 48 c7 c7 76 1c 75 81 e8 13 c2 e9 ff 4c 8b b3 e0 00 00 00 4d 85 ed <41> 0f b7 5e 10 74 2d 4d 85 e4 74 28 e8 98 79 ee ff 49 39 dd 48 RIP [<ffffffff811b99f7>] user_read+0x4f/0x8f RSP <ffff88003bf47f08> CR2: 0000000000000010 Signed-off-by: David Howells <dhowells@redhat.com> Acked-by: Jeff Layton <jlayton@redhat.com> cc: Wang Lei <wang840925@gmail.com> Signed-off-by: James Morris <jmorris@namei.org>	2011-03-04 09:56:19 +11:00
Sage Weil	692d20f576	libceph: retry after authorization failure If we mark the connection CLOSED we will give up trying to reconnect to this server instance. That is appropriate for things like a protocol version mismatch that won't change until the server is restarted, at which point we'll get a new addr and reconnect. An authorization failure like this is probably due to the server not properly rotating it's secret keys, however, and should be treated as transient so that the normal backoff and retry behavior kicks in. Signed-off-by: Sage Weil <sage@newdream.net>	2011-03-03 13:47:40 -08:00
Sage Weil	38815b7802	libceph: fix handling of short returns from get_user_pages get_user_pages() can return fewer pages than we ask for. We were returning a bogus pointer/error code in that case. Instead, loop until we get all the pages we want or get an error we can return to the caller. Signed-off-by: Sage Weil <sage@newdream.net>	2011-03-03 13:47:39 -08:00
David Howells	1000345347	AF_RXRPC: Handle receiving ACKALL packets The OpenAFS server is now sending ACKALL packets, so we need to handle them. Otherwise we report a protocol error and abort. Signed-off-by: David Howells <dhowells@redhat.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2011-03-02 22:18:52 -08:00
John Fastabend	f3d7bc57c7	net: dcbnl: check correct ops in dcbnl_ieee_set() The incorrect ops routine was being tested for in DCB_ATTR_IEEE_PFC attributes. This patch corrects it. Currently, every driver implementing ieee_setets also implements ieee_setpfc so this bug is not actualized yet. Signed-off-by: John Fastabend <john.r.fastabend@intel.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2011-03-02 15:04:33 -08:00
David S. Miller	88d2d28b18	Merge branch 'master' of git://git.kernel.org/pub/scm/linux/kernel/git/kaber/nf-2.6	2011-03-02 11:29:31 -08:00
Jan Engelhardt	9ef0298a8e	netfilter: nf_log: avoid oops in (un)bind with invalid nfproto values Like many other places, we have to check that the array index is within allowed limits, or otherwise, a kernel oops and other nastiness can ensue when we access memory beyond the end of the array. [ 5954.115381] BUG: unable to handle kernel paging request at 0000004000000000 [ 5954.120014] IP: __find_logger+0x6f/0xa0 [ 5954.123979] nf_log_bind_pf+0x2b/0x70 [ 5954.123979] nfulnl_recv_config+0xc0/0x4a0 [nfnetlink_log] [ 5954.123979] nfnetlink_rcv_msg+0x12c/0x1b0 [nfnetlink] ... The problem goes back to v2.6.30-rc1~1372~1342~31 where nf_log_bind was decoupled from nf_log_register. Reported-by: Miguel Di Ciurcio Filho <miguel.filho@gmail.com>, via irc.freenode.net/#netfilter Signed-off-by: Jan Engelhardt <jengelh@medozas.de> Signed-off-by: Patrick McHardy <kaber@trash.net>	2011-03-02 12:10:13 +01:00
Gerrit Renker	720dc34bbb	dccp: fix oops on Reset after close This fixes a bug in the order of dccp_rcv_state_process() that still permitted reception even after closing the socket. A Reset after close thus causes a NULL pointer dereference by not preventing operations on an already torn-down socket. dccp_v4_do_rcv() \| \| state other than OPEN v dccp_rcv_state_process() \| \| DCCP_PKT_RESET v dccp_rcv_reset() \| v dccp_time_wait() WARNING: at net/ipv4/inet_timewait_sock.c:141 __inet_twsk_hashdance+0x48/0x128() Modules linked in: arc4 ecb carl9170 rt2870sta(C) mac80211 r8712u(C) crc_ccitt ah [<c0038850>] (unwind_backtrace+0x0/0xec) from [<c0055364>] (warn_slowpath_common) [<c0055364>] (warn_slowpath_common+0x4c/0x64) from [<c0055398>] (warn_slowpath_n) [<c0055398>] (warn_slowpath_null+0x1c/0x24) from [<c02b72d0>] (__inet_twsk_hashd) [<c02b72d0>] (__inet_twsk_hashdance+0x48/0x128) from [<c031caa0>] (dccp_time_wai) [<c031caa0>] (dccp_time_wait+0x40/0xc8) from [<c031c15c>] (dccp_rcv_state_proces) [<c031c15c>] (dccp_rcv_state_process+0x120/0x538) from [<c032609c>] (dccp_v4_do_) [<c032609c>] (dccp_v4_do_rcv+0x11c/0x14c) from [<c0286594>] (release_sock+0xac/0) [<c0286594>] (release_sock+0xac/0x110) from [<c031fd34>] (dccp_close+0x28c/0x380) [<c031fd34>] (dccp_close+0x28c/0x380) from [<c02d9a78>] (inet_release+0x64/0x70) The fix is by testing the socket state first. Receiving a packet in Closed state now also produces the required "No connection" Reset reply of RFC 4340, 8.3.1. Reported-and-tested-by: Johan Hovold <jhovold@gmail.com> Cc: stable@kernel.org Signed-off-by: Gerrit Renker <gerrit@erg.abdn.ac.uk> Signed-off-by: David S. Miller <davem@davemloft.net>	2011-03-01 23:02:07 -08:00
Julian Anastasov	ff75f40f44	ipvs: fix dst_lock locking on dest update Fix dst_lock usage in __ip_vs_update_dest. We need _bh locking because destination is updated in user context. Can cause lockups on frequent destination updates. Problem reported by Simon Kirby. Bug was introduced in 2.6.37 from the "ipvs: changes for local real server" change. Signed-off-by: Julian Anastasov <ja@ssi.bg> Signed-off-by: Hans Schillstrom <hans@schillstrom.com> Signed-off-by: Simon Horman <horms@verge.net.au>	2011-03-02 07:54:41 +09:00
Andrey Vagin	b44d211e16	netlink: handle errors from netlink_dump() netlink_dump() may failed, but nobody handle its error. It generates output data, when a previous portion has been returned to user space. This mechanism works when all data isn't go in skb. If we enter in netlink_recvmsg() and skb is absent in the recv queue, the netlink_dump() will not been executed. So if netlink_dump() is failed one time, the new data never appear and the reader will sleep forever. netlink_dump() is called from two places: 1. from netlink_sendmsg->...->netlink_dump_start(). In this place we can report error directly and it will be returned by sendmsg(). 2. from netlink_recvmsg There we can't report error directly, because we have a portion of valid output data and call netlink_dump() for prepare the next portion. If netlink_dump() is failed, the socket will be mark as error and the next recvmsg will be failed. Signed-off-by: Andrey Vagin <avagin@openvz.org> Signed-off-by: David S. Miller <davem@davemloft.net>	2011-02-28 12:18:12 -08:00
Hagen Paul Pfeifer	5aca1a9e88	net: handle addr_type of 0 properly addr_type of 0 means that the type should be adopted from from_dev and not from __hw_addr_del_multiple(). Unfortunately it isn't so and addr_type will always be considered. Fix this by implementing the considered and documented behavior. Signed-off-by: Hagen Paul Pfeifer <hagen@jauu.net> Signed-off-by: David S. Miller <davem@davemloft.net>	2011-02-25 13:58:54 -08:00
Anton Blanchard	0a93ea2e89	RxRPC: Allocate tokens with kzalloc to avoid oops in rxrpc_destroy With slab poisoning enabled, I see the following oops: Unable to handle kernel paging request for data at address 0x6b6b6b6b6b6b6b73 ... NIP [c0000000006bc61c] .rxrpc_destroy+0x44/0x104 LR [c0000000006bc618] .rxrpc_destroy+0x40/0x104 Call Trace: [c0000000feb2bc00] [c0000000006bc618] .rxrpc_destroy+0x40/0x104 (unreliable) [c0000000feb2bc90] [c000000000349b2c] .key_cleanup+0x1a8/0x20c [c0000000feb2bd40] [c0000000000a2920] .process_one_work+0x2f4/0x4d0 [c0000000feb2be00] [c0000000000a2d50] .worker_thread+0x254/0x468 [c0000000feb2bec0] [c0000000000a868c] .kthread+0xbc/0xc8 [c0000000feb2bf90] [c000000000020e00] .kernel_thread+0x54/0x70 We aren't initialising token->next, but the code in destroy_context relies on the list being NULL terminated. Use kzalloc to zero out all the fields. Signed-off-by: Anton Blanchard <anton@samba.org> Signed-off-by: David Howells <dhowells@redhat.com> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2011-02-25 11:12:37 -08:00
Lucian Adrian Grijincu	c486da3439	sysctl: ipv6: use correct net in ipv6_sysctl_rtcache_flush Before this patch issuing these commands: fd = open("/proc/sys/net/ipv6/route/flush") unshare(CLONE_NEWNET) write(fd, "stuff") would flush the newly created net, not the original one. The equivalent ipv4 code is correct (stores the net inside ->extra1). Acked-by: Daniel Lezcano <daniel.lezcano@free.fr> Signed-off-by: David S. Miller <davem@davemloft.net>	2011-02-25 11:01:56 -08:00
Linus Torvalds	ef3242859f	Merge git://git.kernel.org/pub/scm/linux/kernel/git/davem/net-2.6 * git://git.kernel.org/pub/scm/linux/kernel/git/davem/net-2.6: (33 commits) Added support for usb ethernet (0x0fe6, 0x9700) r8169: fix RTL8168DP power off issue. r8169: correct settings of rtl8102e. r8169: fix incorrect args to oob notify. DM9000B: Fix PHY power for network down/up DM9000B: Fix reg_save after spin_lock in dm9000_timeout net_sched: long word align struct qdisc_skb_cb data sfc: lower stack usage in efx_ethtool_self_test bridge: Use IPv6 link-local address for multicast listener queries bridge: Fix MLD queries' ethernet source address bridge: Allow mcast snooping for transient link local addresses too ipv6: Add IPv6 multicast address flag defines bridge: Add missing ntohs()s for MLDv2 report parsing bridge: Fix IPv6 multicast snooping by correcting offset in MLDv2 report bridge: Fix IPv6 multicast snooping by storing correct protocol type p54pci: update receive dma buffers before and after processing fix cfg80211_wext_siwfreq lock ordering... rt2x00: Fix WPA TKIP Michael MIC failures. ath5k: Fix fast channel switching tcp: undo_retrans counter fixes ...	2011-02-23 16:02:00 -08:00
David S. Miller	d3bd1b4c89	Merge branch 'master' of git://git.kernel.org/pub/scm/linux/kernel/git/linville/wireless-2.6	2011-02-22 11:53:05 -08:00
Linus Lüssing	fe29ec41aa	bridge: Use IPv6 link-local address for multicast listener queries Currently the bridge multicast snooping feature periodically issues IPv6 general multicast listener queries to sense the absence of a listener. For this, it uses :: as its source address - however RFC 2710 requires: "To be valid, the Query message MUST come from a link-local IPv6 Source Address". Current Linux kernel versions seem to follow this requirement and ignore our bogus MLD queries. With this commit a link local address from the bridge interface is being used to issue the MLD query, resulting in other Linux devices which are multicast listeners in the network to respond with a MLD response (which was not the case before). Signed-off-by: Linus Lüssing <linus.luessing@web.de> Signed-off-by: David S. Miller <davem@davemloft.net>	2011-02-22 10:07:29 -08:00
Linus Lüssing	36cff5a10c	bridge: Fix MLD queries' ethernet source address Map the IPv6 header's destination multicast address to an ethernet source address instead of the MLD queries multicast address. For instance for a general MLD query (multicast address in the MLD query set to ::), this would wrongly be mapped to 33:33:00:00:00:00, although an MLD queries destination MAC should always be 33:33:00:00:00:01 which matches the IPv6 header's multicast destination ff02::1. Signed-off-by: Linus Lüssing <linus.luessing@web.de> Signed-off-by: David S. Miller <davem@davemloft.net>	2011-02-22 10:07:28 -08:00
Linus Lüssing	e4de9f9e83	bridge: Allow mcast snooping for transient link local addresses too Currently the multicast bridge snooping support is not active for link local multicast. I assume this has been done to leave important multicast data untouched, like IPv6 Neighborhood Discovery. In larger, bridged, local networks it could however be desirable to optimize for instance local multicast audio/video streaming too. With the transient flag in IPv6 multicast addresses we have an easy way to optimize such multimedia traffic without tempering with the high priority multicast data from well-known addresses. This patch alters the multicast bridge snooping for IPv6, to take effect for transient multicast addresses instead of non-link-local addresses. Signed-off-by: Linus Lüssing <linus.luessing@web.de> Signed-off-by: David S. Miller <davem@davemloft.net>	2011-02-22 10:07:28 -08:00
Linus Lüssing	d41db9f3f7	bridge: Add missing ntohs()s for MLDv2 report parsing The nsrcs number is 2 Byte wide, therefore we need to call ntohs() before using it. Signed-off-by: Linus Lüssing <linus.luessing@web.de> Signed-off-by: David S. Miller <davem@davemloft.net>	2011-02-22 10:07:27 -08:00
Linus Lüssing	649e984d00	bridge: Fix IPv6 multicast snooping by correcting offset in MLDv2 report We actually want a pointer to the grec_nsrcr and not the following field. Otherwise we can get very high values for *nsrcs as the first two bytes of the IPv6 multicast address are being used instead, leading to a failing pskb_may_pull() which results in MLDv2 reports not being parsed. Signed-off-by: Linus Lüssing <linus.luessing@web.de> Signed-off-by: David S. Miller <davem@davemloft.net>	2011-02-22 10:07:26 -08:00
Linus Lüssing	9cc6e0c4c4	bridge: Fix IPv6 multicast snooping by storing correct protocol type The protocol type for IPv6 entries in the hash table for multicast bridge snooping is falsely set to ETH_P_IP, marking it as an IPv4 address, instead of setting it to ETH_P_IPV6, which results in negative look-ups in the hash table later. Signed-off-by: Linus Lüssing <linus.luessing@web.de> Signed-off-by: David S. Miller <davem@davemloft.net>	2011-02-22 10:07:26 -08:00
Linus Torvalds	8bd89ca220	Merge branch 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/sage/ceph-client * 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/sage/ceph-client: ceph: keep reference to parent inode on ceph_dentry ceph: queue cap_snaps once per realm libceph: fix socket write error handling libceph: fix socket read error handling	2011-02-21 15:01:38 -08:00
Daniel J Blueman	4f919a3bc5	fix cfg80211_wext_siwfreq lock ordering... I previously managed to reproduce a hang while scanning wireless channels (reproducible with airodump-ng hopping channels); subsequent lockdep instrumentation revealed a lock ordering issue. Without knowing the design intent, it looks like the locks should be taken in reverse order; please comment. ======================================================= [ INFO: possible circular locking dependency detected ] 2.6.38-rc5-341cd #4 ------------------------------------------------------- airodump-ng/15445 is trying to acquire lock: (&rdev->devlist_mtx){+.+.+.}, at: [<ffffffff816b1266>] cfg80211_wext_siwfreq+0xc6/0x100 but task is already holding lock: (&wdev->mtx){+.+.+.}, at: [<ffffffff816b125c>] cfg80211_wext_siwfreq+0xbc/0x100 which lock already depends on the new lock. the existing dependency chain (in reverse order) is: -> #1 (&wdev->mtx){+.+.+.}: [<ffffffff810a79d6>] lock_acquire+0xc6/0x280 [<ffffffff816d6bce>] mutex_lock_nested+0x6e/0x4b0 [<ffffffff81696080>] cfg80211_netdev_notifier_call+0x430/0x5f0 [<ffffffff8109351b>] notifier_call_chain+0x8b/0x100 [<ffffffff810935b1>] raw_notifier_call_chain+0x11/0x20 [<ffffffff81576d92>] call_netdevice_notifiers+0x32/0x60 [<ffffffff815771a4>] __dev_notify_flags+0x34/0x80 [<ffffffff81577230>] dev_change_flags+0x40/0x70 [<ffffffff8158587c>] do_setlink+0x1fc/0x8d0 [<ffffffff81586042>] rtnl_setlink+0xf2/0x140 [<ffffffff81586923>] rtnetlink_rcv_msg+0x163/0x270 [<ffffffff8159d741>] netlink_rcv_skb+0xa1/0xd0 [<ffffffff815867b0>] rtnetlink_rcv+0x20/0x30 [<ffffffff8159d39a>] netlink_unicast+0x2ba/0x300 [<ffffffff8159dd57>] netlink_sendmsg+0x267/0x3e0 [<ffffffff8155e364>] sock_sendmsg+0xe4/0x110 [<ffffffff8155f3a3>] sys_sendmsg+0x253/0x3b0 [<ffffffff81003192>] system_call_fastpath+0x16/0x1b -> #0 (&rdev->devlist_mtx){+.+.+.}: [<ffffffff810a7222>] __lock_acquire+0x1622/0x1d10 [<ffffffff810a79d6>] lock_acquire+0xc6/0x280 [<ffffffff816d6bce>] mutex_lock_nested+0x6e/0x4b0 [<ffffffff816b1266>] cfg80211_wext_siwfreq+0xc6/0x100 [<ffffffff816b2fad>] ioctl_standard_call+0x5d/0xd0 [<ffffffff816b3223>] T.808+0x163/0x170 [<ffffffff816b326a>] wext_handle_ioctl+0x3a/0x90 [<ffffffff815798d2>] dev_ioctl+0x6f2/0x830 [<ffffffff8155cf3d>] sock_ioctl+0xfd/0x290 [<ffffffff8117dffd>] do_vfs_ioctl+0x9d/0x590 [<ffffffff8117e53a>] sys_ioctl+0x4a/0x80 [<ffffffff81003192>] system_call_fastpath+0x16/0x1b other info that might help us debug this: 2 locks held by airodump-ng/15445: #0: (rtnl_mutex){+.+.+.}, at: [<ffffffff81586782>] rtnl_lock+0x12/0x20 #1: (&wdev->mtx){+.+.+.}, at: [<ffffffff816b125c>] cfg80211_wext_siwfreq+0xbc/0x100 stack backtrace: Pid: 15445, comm: airodump-ng Not tainted 2.6.38-rc5-341cd #4 Call Trace: [<ffffffff810a3f0a>] ? print_circular_bug+0xfa/0x100 [<ffffffff810a7222>] ? __lock_acquire+0x1622/0x1d10 [<ffffffff810a1f99>] ? trace_hardirqs_off_caller+0x29/0xc0 [<ffffffff810a79d6>] ? lock_acquire+0xc6/0x280 [<ffffffff816b1266>] ? cfg80211_wext_siwfreq+0xc6/0x100 [<ffffffff810a31d7>] ? mark_held_locks+0x67/0x90 [<ffffffff816d6bce>] ? mutex_lock_nested+0x6e/0x4b0 [<ffffffff816b1266>] ? cfg80211_wext_siwfreq+0xc6/0x100 [<ffffffff810a31d7>] ? mark_held_locks+0x67/0x90 [<ffffffff816b1266>] ? cfg80211_wext_siwfreq+0xc6/0x100 [<ffffffff816b1266>] ? cfg80211_wext_siwfreq+0xc6/0x100 [<ffffffff816b2fad>] ? ioctl_standard_call+0x5d/0xd0 [<ffffffff8157818b>] ? __dev_get_by_name+0x9b/0xc0 [<ffffffff816b2f50>] ? ioctl_standard_call+0x0/0xd0 [<ffffffff816b3223>] ? T.808+0x163/0x170 [<ffffffff8112ddf2>] ? might_fault+0x72/0xd0 [<ffffffff816b326a>] ? wext_handle_ioctl+0x3a/0x90 [<ffffffff8112de3b>] ? might_fault+0xbb/0xd0 [<ffffffff815798d2>] ? dev_ioctl+0x6f2/0x830 [<ffffffff810a1bae>] ? put_lock_stats+0xe/0x40 [<ffffffff810a1c8c>] ? lock_release_holdtime+0xac/0x150 [<ffffffff8155cf3d>] ? sock_ioctl+0xfd/0x290 [<ffffffff8117dffd>] ? do_vfs_ioctl+0x9d/0x590 [<ffffffff8116c8ff>] ? fget_light+0x1df/0x3c0 [<ffffffff8117e53a>] ? sys_ioctl+0x4a/0x80 [<ffffffff81003192>] ? system_call_fastpath+0x16/0x1b Signed-off-by: Daniel J Blueman <daniel.blueman@gmail.com> Acked-by: Johannes Berg <johannes@sipsolutions.net> Signed-off-by: John W. Linville <linville@tuxdriver.com>	2011-02-21 15:14:25 -05:00
Yuchung Cheng	c24f691b56	tcp: undo_retrans counter fixes Fix a bug that undo_retrans is incorrectly decremented when undo_marker is not set or undo_retrans is already 0. This happens when sender receives more DSACK ACKs than packets retransmitted during the current undo phase. This may also happen when sender receives DSACK after the undo operation is completed or cancelled. Fix another bug that undo_retrans is incorrectly incremented when sender retransmits an skb and tcp_skb_pcount(skb) > 1 (TSO). This case is rare but not impossible. Signed-off-by: Yuchung Cheng <ycheng@google.com> Acked-by: Ilpo Järvinen <ilpo.jarvinen@helsinki.fi> Signed-off-by: David S. Miller <davem@davemloft.net>	2011-02-21 11:31:18 -08:00
Eric W. Biederman	5f04d5068a	net: Fix more stale on-stack list_head objects. From: Eric W. Biederman <ebiederm@xmission.com> In the beginning with batching unreg_list was a list that was used only once in the lifetime of a network device (I think). Now we have calls using the unreg_list that can happen multiple times in the life of a network device like dev_deactivate and dev_close that are also using the unreg_list. In addition in unregister_netdevice_queue we also do a list_move because for devices like veth pairs it is possible that unregister_netdevice_queue will be called multiple times. So I think the change below to fix dev_deactivate which Eric D. missed will fix this problem. Now to go test that. Signed-off-by: David S. Miller <davem@davemloft.net>	2011-02-20 11:49:45 -08:00
Jiri Bohac	2205a6ea93	sctp: fix reporting of unknown parameters commit `5fa782c2f5` re-worked the handling of unknown parameters. sctp_init_cause_fixed() can now return -ENOSPC if there is not enough tailroom in the error chunk skb. When this happens, the error header is not appended to the error chunk. In that case, the payload of the unknown parameter should not be appended either. Signed-off-by: Jiri Bohac <jbohac@suse.cz> Acked-by: Vlad Yasevich <vladislav.yasevich@hp.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2011-02-19 19:06:55 -08:00
Eric Dumazet	91035f0b7d	tcp: fix inet_twsk_deschedule() Eric W. Biederman reported a lockdep splat in inet_twsk_deschedule() This is caused by inet_twsk_purge(), run from process context, and commit `575f4cd5a5` (net: Use rcu lookups in inet_twsk_purge.) removed the BH disabling that was necessary. Add the BH disabling but fine grained, right before calling inet_twsk_deschedule(), instead of whole function. With help from Linus Torvalds and Eric W. Biederman Reported-by: Eric W. Biederman <ebiederm@xmission.com> Signed-off-by: Eric Dumazet <eric.dumazet@gmail.com> CC: Daniel Lezcano <daniel.lezcano@free.fr> CC: Pavel Emelyanov <xemul@openvz.org> CC: Arnaldo Carvalho de Melo <acme@redhat.com> CC: stable <stable@kernel.org> (# 2.6.33+) Signed-off-by: David S. Miller <davem@davemloft.net>	2011-02-19 18:59:04 -08:00
David S. Miller	ece639caa3	Merge branch 'master' of git://git.kernel.org/pub/scm/linux/kernel/git/kaber/nf-2.6	2011-02-19 16:42:37 -08:00
Linus Torvalds	4c3021da45	Merge git://git.kernel.org/pub/scm/linux/kernel/git/davem/net-2.6 * git://git.kernel.org/pub/scm/linux/kernel/git/davem/net-2.6: (37 commits) net: deinit automatic LIST_HEAD net: dont leave active on stack LIST_HEAD net: provide default_advmss() methods to blackhole dst_ops tg3: Restrict phy ioctl access drivers/net: Call netif_carrier_off at the end of the probe ixgbe: work around for DDP last buffer size ixgbe: fix panic due to uninitialised pointer e1000e: flush all writebacks before unload e1000e: check down flag in tasks isdn: hisax: Use l2headersize() instead of dup (and buggy) func. arp_notify: unconditionally send gratuitous ARP for NETDEV_NOTIFY_PEERS. cxgb4vf: Use defined Mailbox Timeout cxgb4vf: Quiesce Virtual Interfaces on shutdown ... cxgb4vf: Behave properly when CONFIG_DEBUG_FS isn't defined ... cxgb4vf: Check driver parameters in the right place ... pch_gbe: Fix the MAC Address load issue. iwlwifi: Delete iwl3945_good_plcp_health. net/can/softing: make CAN_SOFTING_CS depend on CAN_SOFTING netfilter: nf_iterate: fix incorrect RCU usage pch_gbe: Fix the issue that the receiving data is not normal. ...	2011-02-18 14:15:05 -08:00
Stanislaw Gruszka	05e7c99136	mac80211: fix conn_mon_timer running after disassociate Low level driver could pass rx frames to us after disassociate, what can lead to run conn_mon_timer by ieee80211_sta_rx_notify(). That is obviously wrong, but nothing happens until we unload modules and resources are used after free. If kernel debugging is enabled following warning could be observed: WARNING: at lib/debugobjects.c:259 debug_print_object+0x65/0x70() Hardware name: HP xw8600 Workstation ODEBUG: free active (active state 0) object type: timer_list Modules linked in: iwlagn(-) iwlcore mac80211 cfg80211 aes_x86_64 aes_generic fuse cpufreq_ondemand acpi_cpufreq freq_table mperf xt_physdev ipt_REJECT nf_conntrack_ipv4 nf_defrag_ipv4 iptable_filter ip_tables ip6t_REJECT nf_conntrack_ipv6 nf_defrag_ipv6 xt_state nf_conntrack ip6table_filter ip6_tables ipv6 ext3 jbd dm_mirror dm_region_hash dm_log dm_mod uinput hp_wmi sparse_keymap sg wmi arc4 microcode serio_raw ecb tg3 shpchp rfkill ext4 mbcache jbd2 sr_mod cdrom sd_mod crc_t10dif firewire_ohci firewire_core crc_itu_t mptsas mptscsih mptbase scsi_transport_sas ahci libahci pata_acpi ata_generic ata_piix floppy nouveau ttm drm_kms_helper drm i2c_algo_bit i2c_core video [last unloaded: cfg80211] Pid: 13827, comm: rmmod Tainted: G W 2.6.38-rc4-wl+ #22 Call Trace: [<ffffffff810649cf>] ? warn_slowpath_common+0x7f/0xc0 [<ffffffff81064ac6>] ? warn_slowpath_fmt+0x46/0x50 [<ffffffff81226fc5>] ? debug_print_object+0x65/0x70 [<ffffffff81227625>] ? debug_check_no_obj_freed+0x125/0x210 [<ffffffff8109ebd7>] ? debug_check_no_locks_freed+0xf7/0x170 [<ffffffff81156092>] ? kfree+0xc2/0x2f0 [<ffffffff813ec5c5>] ? netdev_release+0x45/0x60 [<ffffffff812f1067>] ? device_release+0x27/0xa0 [<ffffffff81216ddd>] ? kobject_release+0x8d/0x1a0 [<ffffffff81216d50>] ? kobject_release+0x0/0x1a0 [<ffffffff812183b7>] ? kref_put+0x37/0x70 [<ffffffff81216c57>] ? kobject_put+0x27/0x60 [<ffffffff813d5d1b>] ? netdev_run_todo+0x1ab/0x270 [<ffffffff813e771e>] ? rtnl_unlock+0xe/0x10 [<ffffffffa0581188>] ? ieee80211_unregister_hw+0x58/0x120 [mac80211] [<ffffffffa0377ed7>] ? iwl_pci_remove+0xdb/0x22a [iwlagn] [<ffffffff8123cde2>] ? pci_device_remove+0x52/0x120 [<ffffffff812f5205>] ? __device_release_driver+0x75/0xe0 [<ffffffff812f5348>] ? driver_detach+0xd8/0xe0 [<ffffffff812f4111>] ? bus_remove_driver+0x91/0x100 [<ffffffff812f5b62>] ? driver_unregister+0x62/0xa0 [<ffffffff8123d194>] ? pci_unregister_driver+0x44/0xa0 [<ffffffffa0377df5>] ? iwl_exit+0x15/0x1c [iwlagn] [<ffffffff810ab492>] ? sys_delete_module+0x1a2/0x270 [<ffffffff81498889>] ? trace_hardirqs_on_thunk+0x3a/0x3f [<ffffffff8100bf42>] ? system_call_fastpath+0x16/0x1b Acked-by: Johannes Berg <johannes@sipsolutions.net> Signed-off-by: Stanislaw Gruszka <sgruszka@redhat.com> Signed-off-by: John W. Linville <linville@tuxdriver.com>	2011-02-18 16:47:37 -05:00
Eric Dumazet	ceaaec98ad	net: deinit automatic LIST_HEAD commit `9b5e383c11` (net: Introduce unregister_netdevice_many()) left an active LIST_HEAD() in rollback_registered(), with possible memory corruption. Even if device is freed without touching its unreg_list (and therefore touching the previous memory location holding LISTE_HEAD(single), better close the bug for good, since its really subtle. (Same fix for default_device_exit_batch() for completeness) Reported-by: Michal Hocko <mhocko@suse.cz> Tested-by: Michal Hocko <mhocko@suse.cz> Reported-by: Eric W. Biderman <ebiderman@xmission.com> Tested-by: Eric W. Biderman <ebiderman@xmission.com> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org> Signed-off-by: Eric Dumazet <eric.dumazet@gmail.com> CC: Ingo Molnar <mingo@elte.hu> CC: Octavian Purdila <opurdila@ixiacom.com> CC: stable <stable@kernel.org> [.33+] Signed-off-by: David S. Miller <davem@davemloft.net>	2011-02-18 11:49:36 -08:00
Linus Torvalds	f87e6f4793	net: dont leave active on stack LIST_HEAD Eric W. Biderman and Michal Hocko reported various memory corruptions that we suspected to be related to a LIST head located on stack, that was manipulated after thread left function frame (and eventually exited, so its stack was freed and reused). Eric Dumazet suggested the problem was probably coming from commit `443457242b` (net: factorize sync-rcu call in unregister_netdevice_many) This patch fixes __dev_close() and dev_close() to properly deinit their respective LIST_HEAD(single) before exiting. References: https://lkml.org/lkml/2011/2/16/304 References: https://lkml.org/lkml/2011/2/14/223 Reported-by: Michal Hocko <mhocko@suse.cz> Tested-by: Michal Hocko <mhocko@suse.cz> Reported-by: Eric W. Biderman <ebiderman@xmission.com> Tested-by: Eric W. Biderman <ebiderman@xmission.com> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org> Signed-off-by: Eric Dumazet <eric.dumazet@gmail.com> CC: Ingo Molnar <mingo@elte.hu> CC: Octavian Purdila <opurdila@ixiacom.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2011-02-18 11:49:35 -08:00
Eric Dumazet	214f45c91b	net: provide default_advmss() methods to blackhole dst_ops Commit `0dbaee3b37` (net: Abstract default ADVMSS behind an accessor.) introduced a possible crash in tcp_connect_init(), when dst->default_advmss() is called from dst_metric_advmss() Reported-by: George Spelvin <linux@horizon.com> Signed-off-by: Eric Dumazet <eric.dumazet@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2011-02-18 11:39:01 -08:00
Joerg Marx	0af320fb46	netfilter: ip6t_LOG: fix a flaw in printing the MAC The flaw was in skipping the second byte in MAC header due to increasing the pointer AND indexed access starting at '1'. Signed-off-by: Joerg Marx <joerg.marx@secunet.com> Signed-off-by: Patrick McHardy <kaber@trash.net>	2011-02-17 16:23:40 +01:00
Florian Westphal	d503b30bd6	netfilter: tproxy: do not assign timewait sockets to skb->sk Assigning a socket in timewait state to skb->sk can trigger kernel oops, e.g. in nfnetlink_log, which does: if (skb->sk) { read_lock_bh(&skb->sk->sk_callback_lock); if (skb->sk->sk_socket && skb->sk->sk_socket->file) ... in the timewait case, accessing sk->sk_callback_lock and sk->sk_socket is invalid. Either all of these spots will need to add a test for sk->sk_state != TCP_TIME_WAIT, or xt_TPROXY must not assign a timewait socket to skb->sk. This does the latter. If a TW socket is found, assign the tproxy nfmark, but skip the skb->sk assignment, thus mimicking behaviour of a '-m socket .. -j MARK/ACCEPT' re-routing rule. The 'SYN to TW socket' case is left unchanged -- we try to redirect to the listener socket. Cc: Balazs Scheidler <bazsi@balabit.hu> Cc: KOVACS Krisztian <hidden@balabit.hu> Signed-off-by: Florian Westphal <fwestphal@astaro.com> Signed-off-by: Patrick McHardy <kaber@trash.net>	2011-02-17 11:32:38 +01:00
Vladislav P	840af824b2	Bluetooth: Release BTM while sleeping to avoid deadlock Signed-off-by: Vladislav P <vladisslav@inbox.ru> Signed-off-by: Gustavo F. Padovan <padovan@profusion.mobi>	2011-02-16 15:54:11 -03:00
Ian Campbell	d11327ad66	arp_notify: unconditionally send gratuitous ARP for NETDEV_NOTIFY_PEERS. NETDEV_NOTIFY_PEER is an explicit request by the driver to send a link notification while NETDEV_UP/NETDEV_CHANGEADDR generate link notifications as a sort of side effect. In the later cases the sysctl option is present because link notification events can have undesired effects e.g. if the link is flapping. I don't think this applies in the case of an explicit request from a driver. This patch makes NETDEV_NOTIFY_PEER unconditional, if preferred we could add a new sysctl for this case which defaults to on. This change causes Xen post-migration ARP notifications (which cause switches to relearn their MAC tables etc) to be sent by default. Signed-off-by: Ian Campbell <ian.campbell@citrix.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2011-02-14 17:47:15 -08:00
David S. Miller	8bc26a008f	Merge branch 'master' of git://git.kernel.org/pub/scm/linux/kernel/git/linville/wireless-2.6	2011-02-14 12:51:42 -08:00
David S. Miller	af756e9d88	Merge branch 'master' of git://git.kernel.org/pub/scm/linux/kernel/git/kaber/nf-2.6	2011-02-14 11:16:12 -08:00
Patrick McHardy	de9963f0f2	netfilter: nf_iterate: fix incorrect RCU usage As noticed by Eric, nf_iterate doesn't use RCU correctly by accessing the prev pointer of a RCU protected list element when a verdict of NF_REPEAT is issued. Fix by jumping backwards to the hook invocation directly instead of loading the previous list element before continuing the list iteration. Reported-by: Eric Dumazet <eric.dumazet@gmail.com> Acked-by: Eric Dumazet <eric.dumazet@gmail.com> Signed-off-by: Patrick McHardy <kaber@trash.net>	2011-02-14 17:35:07 +01:00
Jesper Juhl	d3337de52a	Don't potentially dereference NULL in net/dcb/dcbnl.c:dcbnl_getapp() nla_nest_start() may return NULL. If it does then we'll blow up in nla_nest_end() when we dereference the pointer. Signed-off-by: Jesper Juhl <jj@chaosbits.net> Signed-off-by: David S. Miller <davem@davemloft.net>	2011-02-13 11:21:14 -08:00
John Fastabend	7ec79270d7	net: dcb: application priority is per net_device The app_data priority may not be the same for all net devices. In order for stacks with application notifiers to identify the specific net device dcb_app_type should be passed in the ptr. This allows handlers to use dev_get_by_name() to pin priority to net devices. Signed-off-by: John Fastabend <john.r.fastabend@intel.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2011-02-13 11:02:39 -08:00
Herbert Xu	8a870178c0	bridge: Replace mp->mglist hlist with a bool As it turns out we never need to walk through the list of multicast groups subscribed by the bridge interface itself (the only time we'd want to do that is when we shut down the bridge, in which case we simply walk through all multicast groups), we don't really need to keep an hlist for mp->mglist. This means that we can replace it with just a single bit to indicate whether the bridge interface is subscribed to a group. Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au> Signed-off-by: David S. Miller <davem@davemloft.net>	2011-02-12 01:05:42 -08:00

1 2 3 4 5 ...

17860 Commits