kernel_optimize_test

Author	SHA1	Message	Date
Mike Christie	b6c395ed03	[SCSI] iscsi bugfixes: fix r2t handling The iscsi tcp code can pluck multiple r2ts from the task's r2tqueue in the xmit code. This can result in the task being queued on the xmit queue but getting completed at the same time. This patch fixes the above bug by making the fifo a list so we always remove the entry on the list del. Signed-off-by: Mike Christie <michaelc@cs.wisc.edu> Signed-off-by: James Bottomley <James.Bottomley@SteelEye.com>	2006-07-28 11:47:40 -05:00
Mike Christie	d82967c706	[SCSI] iscsi bugfixes: send correct error values to userspace In the xmit patch we are sending a -EXXX value to iscsi_conn_failure which is causing userspace to get confused. We should be sending a ISCSI_ERR_* value that userspace understands. Signed-off-by: Mike Christie <michaelc@cs.wisc.edu> Signed-off-by: James Bottomley <James.Bottomley@SteelEye.com>	2006-07-28 11:47:23 -05:00
HighPoint Linux Team	8d4fbd3f97	[SCSI] hptiop: wrong register used in hptiop_reset_hba() IOP reset message should be posted to inbound message register instead of outbound message register. Signed-off-by: HighPoint Linux Team <linux@highpoint-tech.com> Signed-off-by: James Bottomley <James.Bottomley@SteelEye.com>	2006-07-28 11:47:02 -05:00
Grant Grundler	b2b3c12107	[SCSI] sym2: claim only "Storage" class The following patch fixes a problem for Matt Taggart. The Compaq system he had (dl380?) has a SmartArray device that exposes the 53c1510 device in both RAID and "normal" modes. The difference is that in RAID mode, the smart array driver (IIRC) should claim the device instead of the sym2 driver. The patch below prevents sym2 from claiming the device when the RAID "daughter board" is attached. Signed-off-by: Grant Grundler <grundler@parisc-linux.org> Signed-off-by: James Bottomley <James.Bottomley@SteelEye.com>	2006-07-28 11:46:38 -05:00
Christoph Hellwig	64821324ca	[PATCH] fix compile regression for a few scsi drivers This fixes three drivers to compile again after my patch that removes the data_cmnd member from struct scsi_cmnd. The fas216 change is trivial, it should have been using ->cmnd all the time. NCR53C9 (which seems to be mostly a duplicate of esp.c!) is doing something odd, it should only have looked at ->cmnd before, not the saved copy that is kept for the error handler's sake. Note that it really should not deal with the sync setting itself but use the generic domain validation code that gets this right - but that's for later, let's push this simple compile fix for now. And sorry for the late fix for this, I have been busy with OLS and associated activities last week. Signed-off-by: Christoph Hellwig <hch@lst.de> Signed-off-by: Linus Torvalds <torvalds@osdl.org>	2006-07-26 07:30:45 -07:00
Linus Torvalds	dab5025ca2	Merge master.kernel.org:/pub/scm/linux/kernel/git/davem/sparc-2.6 * master.kernel.org:/pub/scm/linux/kernel/git/davem/sparc-2.6: [SCSI] esp: Fix build. [SPARC]: Fix SA_STATIC_ALLOC value. [SPARC64]: Explicitly print return PC when the kernel fault PC is bogus.	2006-07-26 07:22:36 -07:00
Linus Torvalds	761a126017	Merge master.kernel.org:/pub/scm/linux/kernel/git/davem/net-2.6 * master.kernel.org:/pub/scm/linux/kernel/git/davem/net-2.6: [IPV4/IPV6]: Setting 0 for unused port field in RAW IP recvmsg(). [IPV4] ipmr: ip multicast route bug fix. [TG3]: Update version and reldate [TG3]: Handle tg3_init_rings() failures [TG3]: Add tg3_restart_hw() [IPV4]: Clear the whole IPCB, this clears also IPCB(skb)->flags. [IPV6]: Clean skb cb on IPv6 input. [NETFILTER]: Demote xt_sctp to EXPERIMENTAL [NETFILTER]: bridge netfilter: add deferred output hooks to feature-removal-schedule [NETFILTER]: xt_pkttype: fix mismatches on locally generated packets [NETFILTER]: SNMP NAT: fix byteorder confusion [NETFILTER]: conntrack: fix SYSCTL=n compile [NETFILTER]: nf_queue: handle NF_STOP and unknown verdicts in nf_reinject [NETFILTER]: H.323 helper: fix possible NULL-ptr dereference	2006-07-26 07:22:10 -07:00
Arjan van de Ven	153d7f3fca	[PATCH] Reorganize the cpufreq cpu hotplug locking to not be totally bizarre The patch below moves the cpu hotplugging higher up in the cpufreq layering; this is needed to avoid recursive taking of the cpu hotplug lock and to otherwise detangle the mess. The new rules are: 1. you must do lock_cpu_hotplug() around the following functions: __cpufreq_driver_target __cpufreq_governor (for CPUFREQ_GOV_LIMITS operation only) __cpufreq_set_policy 2. governor methods (.governor) must NOT take the lock_cpu_hotplug() lock in any way; they are called with the lock taken already 3. if your governor spawns a thread that does things, like calling __cpufreq_driver_target, your thread must honor rule #1. 4. the policy lock and other cpufreq internal locks nest within the lock_cpu_hotplug() lock. I'm not entirely happy about how the __cpufreq_governor rule ended up (conditional locking rule depending on the argument) but basically all callers pass this as a constant so it's not too horrible. The patch also removes the cpufreq_governor() function since during the locking audit it turned out to be entirely unused (so no need to fix it). The patch works on my testbox, but it could use more testing (otoh... it can't be much worse than the current code) Signed-off-by: Arjan van de Ven <arjan@linux.intel.com> Signed-off-by: Linus Torvalds <torvalds@osdl.org>	2006-07-26 07:21:40 -07:00
Tetsuo Handa	f59fc7f30b	[IPV4/IPV6]: Setting 0 for unused port field in RAW IP recvmsg(). From: Tetsuo Handa from-linux-kernel@i-love.sakura.ne.jp The recvmsg() for raw socket seems to return a random u16 value from the kernel stack memory since the port field is not initialized. But I'm not sure this patch is correct. Does raw socket return any information stored in the port field? [ BSD defines RAW IP recvmsg to return a sin_port value of zero. This is described in Stevens' TCP/IP Illustrated Volume 2 on page 1055, which is discussing the BSD rip_input() implementation. ] Acked-by: YOSHIFUJI Hideaki <yoshfuji@linux-ipv6.org> Signed-off-by: David S. Miller <davem@davemloft.net>	2006-07-25 17:05:35 -07:00
Alexey Kuznetsov	7228749092	[IPV4] ipmr: ip multicast route bug fix. IP multicast route code was reusing an skb which causes use after free and double free. From: Alexey Kuznetsov <kuznet@ms2.inr.ac.ru> Note, it is real skb_clone(), not alloc_skb(). Enqueued skb contains the whole half-prepared netlink message plus room for the rest. It could be also skb_copy(), if we want to be puristic about mangling cloned data, but original copy is really not going to be used. Acked-by: Stephen Hemminger <shemminger@osdl.org> Signed-off-by: David S. Miller <davem@davemloft.net>	2006-07-25 16:45:12 -07:00
Michael Chan	b6e77a5346	[TG3]: Update version and reldate Update version to 3.63. Signed-off-by: Michael Chan <mchan@broadcom.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2006-07-25 16:39:12 -07:00
Michael Chan	32d8c5724b	[TG3]: Handle tg3_init_rings() failures Handle dev_alloc_skb() failures when initializing the RX rings. Without proper handling, the driver will crash when using a partial ring. Thanks to Stephane Doyon <sdoyon@max-t.com> for reporting the bug and providing the initial patch. Howie Xu <howie@vmware.com> also reported the same issue. Signed-off-by: Michael Chan <mchan@broadcom.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2006-07-25 16:38:29 -07:00
Michael Chan	b9ec6c1b91	[TG3]: Add tg3_restart_hw() Add tg3_restart_hw() to handle failures when re-initializing the device. Signed-off-by: Michael Chan <mchan@broadcom.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2006-07-25 16:37:27 -07:00
Jens Axboe	44eb123126	[PATCH] cfq-iosched: don't use a hard jiffies value, translate from msecs The CIC_SEEKY() test really wants to use the minimum of either: - 2 msecs (not jiffies) - or, the pending slice time So code it like that. Signed-off-by: Jens Axboe <axboe@suse.de>	2006-07-25 15:05:21 +02:00
Milton Miller	ad01b1ca79	[PATCH] blktrace: fix read-ahead bit It should be toggling the same bit on and off, fix it up. Signed-off-by: Jens Axboe <axboe@suse.de>	2006-07-25 15:04:13 +02:00
Jens Axboe	7b30f09245	[PATCH] cciss: fix stall with softirq handling and CFQ We need to postpone the queue startup until after the softirq handler has actually finished some requests, otherwise we could be racing with cciss_softirq_done() and not actually restart the queue handling. Signed-off-by: Jens Axboe <axboe@suse.de>	2006-07-25 15:02:48 +02:00
Guillaume Chazarain	d569f1d72f	[IPV4]: Clear the whole IPCB, this clears also IPCB(skb)->flags. Signed-off-by: Guillaume Chazarain <guichaz@yahoo.fr> Signed-off-by: David S. Miller <davem@davemloft.net>	2006-07-24 23:45:16 -07:00
Guillaume Chazarain	6b7fdc3ae1	[IPV6]: Clean skb cb on IPv6 input. Clear the accumulated junk in IP6CB when starting to handle an IPV6 packet. Signed-off-by: Guillaume Chazarain <guichaz@yahoo.fr> Signed-off-by: David S. Miller <davem@davemloft.net>	2006-07-24 23:44:44 -07:00
Patrick McHardy	d5af981e93	[NETFILTER]: Demote xt_sctp to EXPERIMENTAL After the recent problems with all the SCTP stuff it seems reasonable to mark this as experimental. Signed-off-by: Patrick McHardy <kaber@trash.net> Signed-off-by: David S. Miller <davem@davemloft.net>	2006-07-24 22:55:29 -07:00
Patrick McHardy	10ea6ac895	[NETFILTER]: bridge netfilter: add deferred output hooks to feature-removal-schedule Add bridge netfilter deferred output hooks to feature-removal-schedule and disable them by default. Until their removal they will be activated by the physdev match when needed. Signed-off-by: Patrick McHardy <kaber@trash.net> Signed-off-by: David S. Miller <davem@davemloft.net>	2006-07-24 22:54:55 -07:00
Phil Oester	28658c8967	[NETFILTER]: xt_pkttype: fix mismatches on locally generated packets Locally generated broadcast and multicast packets have pkttype set to PACKET_LOOPBACK instead of PACKET_BROADCAST or PACKET_MULTICAST. This causes the pkttype match to fail to match packets of either type. The below patch remedies this by using the daddr as a hint as to broadcast|multicast. While not pretty, this seems like the only way to solve the problem short of just noting this as a limitation of the match. This resolves netfilter bugzilla #484 Signed-off-by: Phil Oester <kernel@linuxace.com> Signed-off-by: Patrick McHardy <kaber@trash.net> Signed-off-by: David S. Miller <davem@davemloft.net>	2006-07-24 22:54:14 -07:00
Patrick McHardy	8cf8fb5687	[NETFILTER]: SNMP NAT: fix byteorder confusion Signed-off-by: Patrick McHardy <kaber@trash.net> Signed-off-by: David S. Miller <davem@davemloft.net>	2006-07-24 22:53:35 -07:00
Adrian Bunk	72b5582359	[NETFILTER]: conntrack: fix SYSCTL=n compile Signed-off-by: Adrian Bunk <bunk@stusta.de> Signed-off-by: Patrick McHardy <kaber@trash.net> Signed-off-by: David S. Miller <davem@davemloft.net>	2006-07-24 22:53:12 -07:00
Patrick McHardy	3bc38712e3	[NETFILTER]: nf_queue: handle NF_STOP and unknown verdicts in nf_reinject In case of an unknown verdict or NF_STOP the packet leaks. Unknown verdicts can happen when userspace is buggy. Reinject the packet in case of NF_STOP, drop on unknown verdicts. Signed-off-by: Patrick McHardy <kaber@trash.net> Signed-off-by: David S. Miller <davem@davemloft.net>	2006-07-24 22:52:47 -07:00
Patrick McHardy	083edca05a	[NETFILTER]: H.323 helper: fix possible NULL-ptr dereference An RCF message containing a timeout results in a NULL-ptr dereference if no RRQ has been seen before. Noticed by the "SATURN tool", reported by Thomas Dillig <tdillig@stanford.edu> and Isil Dillig <isil@stanford.edu>. Signed-off-by: Patrick McHardy <kaber@trash.net> Signed-off-by: David S. Miller <davem@davemloft.net>	2006-07-24 22:52:10 -07:00
David S. Miller	6bc063d414	[SCSI] esp: Fix build. The data_cmnd[] member got deleted, so do not use it any more. Scsi commands do not have their ->cmnd[] overwritten temporarily to probe for status after an error before retrying. Signed-off-by: David S. Miller <davem@davemloft.net>	2006-07-24 22:47:14 -07:00
David S. Miller	29ed46015d	[SPARC]: Fix SA_STATIC_ALLOC value. It aliases IRQF_SHARED which causes all kinds of problems. Signed-off-by: David S. Miller <davem@davemloft.net>	2006-07-24 22:34:00 -07:00
David S. Miller	eb398d1044	[SPARC64]: Explicitly print return PC when the kernel fault PC is bogus. That way we'll have at least some debugging info even if the stack dump explodes. Signed-off-by: David S. Miller <davem@davemloft.net>	2006-07-24 22:33:58 -07:00
Christoph Hellwig	b4e54de8d3	[NET]: Correct dev_alloc_skb kerneldoc dev_alloc_skb is designated for RX descriptors, not TX. (Some drivers use it for the latter anyway, but that's a different story) Signed-off-by: Christoph Hellwig <hch@lst.de> Signed-off-by: David S. Miller <davem@davemloft.net>	2006-07-24 15:31:14 -07:00
Christoph Hellwig	37182d1bd3	[NET]: Remove CONFIG_HAVE_ARCH_DEV_ALLOC_SKB skbuff.h has an #ifndef CONFIG_HAVE_ARCH_DEV_ALLOC_SKB to allow architectures to reimplement __dev_alloc_skb. It's not set on any architecture and now that we have an architecture-overrideable NET_SKB_PAD there is no point at all to have one either. Signed-off-by: Christoph Hellwig <hch@lst.de> Signed-off-by: David S. Miller <davem@davemloft.net>	2006-07-24 15:30:28 -07:00
Stefan Rompf	6c753c3d3b	[VLAN]: Fix link state propagation When the queue of the underlying device is stopped at initialization time or the device is marked "not present", the state will be propagated to the vlan device and never change. Based on an analysis by Patrick McHardy. Signed-off-by: Stefan Rompf <stefan@loplof.de> ACKed-by: Patrick McHardy <kaber@trash.net> Signed-off-by: David S. Miller <davem@davemloft.net>	2006-07-24 13:52:13 -07:00
David S. Miller	a922ba5510	[IPV6] xfrm6_tunnel: Delete debugging code. It doesn't compile, and it's dubious in several regards: 1) is enabled by non-Kconfig controlled CONFIG_* value (noted by Randy Dunlap) 2) XFRM6_TUNNEL_SPI_MAGIC is defined after its first use 3) the debugging messages print object pointer addresses which have no meaning without context So let's just get rid of it. Signed-off-by: David S. Miller <davem@davemloft.net>	2006-07-24 13:49:06 -07:00
Marcel Holtmann	e9e9290f5c	[Bluetooth] Enable SCO support for Broadcom HID proxy dongle The Broadcom dongles with HID proxy support actually support SCO over HCI if the SCO buffer size values are corrected. So instead of disabling the SCO support, mark this dongle with the quirk for the Bluetooth core to correct the wrong buffer size values. Signed-off-by: Marcel Holtmann <marcel@holtmann.org>	2006-07-24 12:44:34 -07:00
Marcel Holtmann	8e4f7230a3	[Bluetooth] Add quirk for another broken RTX Telecom based dongle This patch disables the ISOC transfers for another broken RTX Telecom based USB dongle. Starting the USB ISOC transfers only ends in a burst of error messages for invalid SCO packets on connection handle 0. Signed-off-by: Marcel Holtmann <marcel@holtmann.org>	2006-07-24 12:44:32 -07:00
Marcel Holtmann	ea9727f6e5	[Bluetooth] Correct SCO buffer size for Belkin devices The Belkin F8T012 and F8T013 devices are both based on a Bluetooth chip from Broadcom and their SCO buffer size values are wrong. The Bluetooth core should correct these values. Signed-off-by: Marcel Holtmann <marcel@holtmann.org>	2006-07-24 12:44:30 -07:00
Marcel Holtmann	520ca78acc	[Bluetooth] Correct SCO buffer size for another Broadcom chip The SCO buffer size values on IBM/Lenovo ThinkPad laptops with a Bluetooth chip from Broadcom are wrong. The USB Bluetooth driver has to set a quirk to correct the SCO buffer size values. Signed-off-by: Marcel Holtmann <marcel@holtmann.org>	2006-07-24 12:44:27 -07:00
Marcel Holtmann	98bcd08b5b	[Bluetooth] Correct RFCOMM channel MTU for broken implementations Some Bluetooth RFCOMM implementations try to negotiate a bigger channel MTU than we can support for a particular session. The maximum MTU for a RFCOMM session is limited through the L2CAP layer. So if the other side proposes a channel MTU that is bigger than the underlying L2CAP MTU, we should reduce it to the L2CAP MTU of the session minus five bytes for the RFCOMM headers. Signed-off-by: Marcel Holtmann <marcel@holtmann.org>	2006-07-24 12:44:25 -07:00
Guillaume Chazarain	2266d8886f	[PKT_SCHED]: Fix regression in PSCHED_TADD{,2}. In PSCHED_TADD and PSCHED_TADD2, if delta is less than tv.tv_usec (so, less than USEC_PER_SEC too) then tv_res will be smaller than tv. The assignment "(tv_res).tv_usec = __delta;" is wrong. The fix is to revert to the original code before `4ee303dfea` and change the 'if' to 'while'. [Shuya MAEDA: "while (__delta >= USEC_PER_SEC){ ... }" instead of "while (__delta > USEC_PER_SEC){ ... }"] Signed-off-by: Guillaume Chazarain <guichaz@yahoo.fr> Signed-off-by: David S. Miller <davem@davemloft.net>	2006-07-24 12:44:23 -07:00
Ian McDonald	4b79f0af48	[DCCP]: Fix default sequence window size When using the default sequence window size (100) I got the following in my logs: Jun 22 14:24:09 localhost kernel: [ 1492.114775] DCCP: Step 6 failed for DATA packet, (LSWL(6279674225) <= P.seqno(6279674749) <= S.SWH(6279674324)) and (P.ackno doesn't exist or LAWL(18798206530) <= P.ackno(1125899906842620) <= S.AWH(18798206548), sending SYNC... Jun 22 14:24:09 localhost kernel: [ 1492.115147] DCCP: Step 6 failed for DATA packet, (LSWL(6279674225) <= P.seqno(6279674750) <= S.SWH(6279674324)) and (P.ackno doesn't exist or LAWL(18798206530) <= P.ackno(1125899906842620) <= S.AWH(18798206549), sending SYNC... I went to alter the default sysctl and it didn't take for new sockets. Below patch fixes this. I think the default is too low but it is what the DCCP spec specifies. As a side effect of this my rx speed using iperf goes from about 2.8 Mbits/sec to 3.5. This is still far too slow but it is a step in the right direction. Compile tested only for IPv6 but not particularly complex change. Signed off by: Ian McDonald <ian.mcdonald@jandi.co.nz> Signed-off-by: Arnaldo Carvalho de Melo <acme@mandriva.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2006-07-24 12:44:21 -07:00
Roland Dreier	8fdf679fdb	IB/mthca: Initialize max_cmds before debug code prints it Read the max_cmds value from the response to the QUERY_FW command before printing out the value, so that the real value goes into the debug output. Signed-off-by: Roland Dreier <rolandd@cisco.com>	2006-07-24 09:36:50 -07:00
Michael S. Tsirkin	8a7f752125	IB/ipoib: Fix packet loss after hardware address update The neighbour ha field may get updated without destroying the neighbour. In this case, the ha field gets out of sync with the address handle stored in ipoib_neigh->ah, with the result that the ah field would point to an incorrect path, resulting in all packets being lost. Signed-off-by: Michael S. Tsirkin <mst@mellanox.co.il> Signed-off-by: Roland Dreier <rolandd@cisco.com>	2006-07-24 09:18:07 -07:00
Or Gerlitz	624d01f899	IB/ipoib: Fix oops with ipoib_debug_mcast set Need to set mcast->ah before debug code dereferences it. Signed-off-by: Or Gerlitz <ogerlitz@voltaire.com> Signed-off-by: Roland Dreier <rolandd@cisco.com>	2006-07-24 09:18:07 -07:00
Sean Hefty	2527e681fd	IB/mad: Validate MADs for spec compliance Validate MADs sent by userspace clients for spec compliance with C13-18.1.1 (prevent duplicate requests and responses sent on the same port). Without this, RMPP transactions get aborted because of duplicate packets. This patch is similar to that provided by Jack Morgenstein. Signed-off-by: Sean Hefty <sean.hefty@intel.com> Signed-off-by: Michael S. Tsirkin <mst@mellanox.co.il> Signed-off-by: Jack Morgenstein <jackm@mellanox.co.il> Signed-off-by: Roland Dreier <rolandd@cisco.com>	2006-07-24 09:18:07 -07:00
Ralph Campbell	16c59419a0	IB/ipath: ipath_skip_sge() can break if num_sge > 1 ipath_skip_sge() doesn't exactly duplicate the side effects of ipath_copy_sge() if num_sge > 1 since it doesn't decrement ss->num_sge. This could result in the sg_list being accessed out of bounds. Since ipath_skip_sge() is almost always called with num_sge == 1, the original "optimization" is almost never used. Signed-off-by: Ralph Campbell <ralph.campbell@qlogic.com> Signed-off-by: Roland Dreier <rolandd@cisco.com>	2006-07-24 09:18:07 -07:00
Ralph Campbell	c9f79bdc21	IB/ipath: Fix ib_ipath driver to work with SRP I am still working on a proposal to remove the phys_to_virt() calls in the ib_ipath driver. In the mean time, this patch allows SRP to work by fixing the R_Key check and conversion from IB address to kernel virtual address. It also returns the correct page size for FMRs. Signed-off-by: Ralph Campbell <ralph.campbell@qlogic.com> Signed-off-by: Roland Dreier <rolandd@cisco.com>	2006-07-24 09:18:07 -07:00
Ralph Campbell	3d37b9e209	IB/ipath: Fix a data corruption This patch fixes a problem where certain error packets are passed to the InfiniBand layer for processing even though the packet actually was received with an error. Signed-off-by: Ralph Campbell <ralph.campbell@qlogic.com> Signed-off-by: Roland Dreier <rolandd@cisco.com>	2006-07-24 09:18:05 -07:00
Dotan Barak	1252c517cf	IB/mthca: Fix SRQ limit event range check Mem-free HCAs always keep one spare SRQ WQE, so the SRQ limit cannot be set beyond srq->max - 1. Signed-off-by: Dotan Barak <dotanb@mellanox.co.il> Signed-off-by: Michael S. Tsirkin <mst@mellanox.co.il> Signed-off-by: Roland Dreier <rolandd@cisco.com>	2006-07-24 07:20:32 -07:00
Roland Dreier	43db2bc044	IB/uverbs: Fix lockdep warnings Lockdep warns because uverbs is trying to take uobj->mutex when it already holds that lock. This is because there are really multiple types of uobjs even though all of their locks are initialized in common code. Signed-off-by: Roland Dreier <rolandd@cisco.com>	2006-07-23 15:16:04 -07:00
Michael S. Tsirkin	ec924b4726	IB/uverbs: Fix unlocking in error paths ib_uverbs_create_ah() and ib_uverbs_create_srq() did not release the PD's read lock in their error paths, which led to deadlock when destroying the PD. Signed-off-by: Michael S. Tsirkin <mst@mellanox.co.il> Signed-off-by: Roland Dreier <rolandd@cisco.com>	2006-07-23 15:16:03 -07:00
Paul Jackson	abb5a5cc6b	[PATCH] Cpuset: fix ABBA deadlock with cpu hotplug lock Fix ABBA deadlock between lock_cpu_hotplug() and the cpuset callback_mutex lock. It only happens on cpu_exclusive cpusets, due to the dynamic sched domain code trying to take the cpu hotplug lock inside the cpuset callback_mutex lock. This bug has apparently been here for several months, but didn't get hit until the right customer load on a large system. This fix appears right from inspection, but it will take a few more days running it on that customer's workload to be confident we nailed it. We don't have any other reproducible test case. The cpu_hotplug_lock() tends to cover large runs of code. The other places that hold both that lock and the cpuset callback mutex lock always nest the cpuset lock inside the hotplug lock. This place tries to do the reverse, risking an ABBA deadlock. This is in the cpuset_rmdir() code, where we: * take the callback_mutex lock * mark the cpuset CS_REMOVED * call update_cpu_domains for cpu_exclusive cpusets * in that call, take the cpu_hotplug lock if the cpuset is marked for removal. Thanks to Jack Steiner for identifying this deadlock. The fix is to tear down the dynamic sched domain before we grab the cpuset callback_mutex lock. This way, the two locks are serialized, with the hotplug lock taken and released before trying for the cpuset lock. I suspect that this bug was introduced when I changed the cpuset locking from one lock to two. The dynamic sched domain dependency on cpu_exclusive cpusets and its hotplug hooks were added to this code earlier, when cpusets had only a single lock. It may well have been fine then. Signed-off-by: Paul Jackson <pj@sgi.com> Signed-off-by: Linus Torvalds <torvalds@osdl.org>	2006-07-23 13:03:05 -07:00
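
The carry handling described in commit 2266d8886f ([PKT_SCHED]: Fix regression in PSCHED_TADD{,2}) can be sketched in user-space C roughly as follows. This is an illustration only, not the kernel macro: struct sketch_timeval and sketch_tadd() are hypothetical names, and the sketch assumes a non-negative delta given in microseconds.

```c
#include <assert.h>

#define USEC_PER_SEC 1000000L

/* Hypothetical stand-in for the timeval-style value the macros work on. */
struct sketch_timeval {
	long tv_sec;
	long tv_usec;
};

/*
 * Add delta_usec microseconds to tv and return a normalized result.
 * The loop uses ">=" so that exactly USEC_PER_SEC microseconds is
 * carried into tv_sec; with ">" the result could keep tv_usec equal to
 * USEC_PER_SEC, which is not a valid microsecond field.
 */
static struct sketch_timeval sketch_tadd(struct sketch_timeval tv,
					 long delta_usec)
{
	struct sketch_timeval res;
	long usec;

	assert(delta_usec >= 0);	/* assumption of this sketch */

	usec = tv.tv_usec + delta_usec;
	res.tv_sec = tv.tv_sec;
	while (usec >= USEC_PER_SEC) {
		res.tv_sec++;
		usec -= USEC_PER_SEC;
	}
	res.tv_usec = usec;
	return res;
}
```

Using a loop rather than a single 'if' means a delta of several seconds is still carried in full, and the result is never smaller than the input for a non-negative delta.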

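
The MTU rule stated in commit 98bcd08b5b ([Bluetooth] Correct RFCOMM channel MTU for broken implementations) might look like the following when written as a standalone helper. This is a sketch under assumptions, not the code from the patch: sketch_clamp_rfcomm_mtu() and SKETCH_RFCOMM_HDR_ROOM are hypothetical names, and the only facts taken from the commit message are the comparison against the L2CAP MTU and the five bytes reserved for RFCOMM headers.

```c
#include <assert.h>
#include <stdint.h>

/* Five bytes reserved for RFCOMM headers, per the commit message. */
#define SKETCH_RFCOMM_HDR_ROOM 5

/*
 * Return the channel MTU to use for a session.
 * If the peer proposes more than the underlying L2CAP MTU, fall back to
 * the L2CAP MTU minus the header room; otherwise keep the proposal.
 */
static uint16_t sketch_clamp_rfcomm_mtu(uint16_t proposed_mtu,
					uint16_t l2cap_mtu)
{
	/* Assumption of this sketch: the L2CAP MTU leaves room for headers. */
	assert(l2cap_mtu > SKETCH_RFCOMM_HDR_ROOM);

	if (proposed_mtu > l2cap_mtu)
		return (uint16_t)(l2cap_mtu - SKETCH_RFCOMM_HDR_ROOM);
	return proposed_mtu;
}
```

For instance, with an L2CAP MTU of 672 a proposal of 1000 would be reduced to 667, while a proposal of 127 would be kept unchanged.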

32848 Commits