kernel_optimize_test

Author	SHA1	Message	Date
Martin K. Petersen	98262f2762	block: Allow devices to indicate whether discarded blocks are zeroed The discard ioctl is used by mkfs utilities to clear a block device prior to putting metadata down. However, not all devices return zeroed blocks after a discard. Some drives return stale data, potentially containing old superblocks. It is therefore important to know whether discarded blocks are properly zeroed. Both ATA and SCSI drives have configuration bits that indicate whether zeroes are returned after a discard operation. Implement a block level interface that allows this information to be bubbled up the stack and queried via a new block device ioctl. Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com> Signed-off-by: Jens Axboe <jens.axboe@oracle.com>	2009-12-03 09:24:48 +01:00
Jens Axboe	464191c65b	Revert "cfq: Make use of service count to estimate the rb_key offset" This reverts commit `3586e917f2`. Corrado Zoccolo <czoccolo@gmail.com> correctly points out, that we need consistency of rb_key offset across groups. This means we cannot properly use the per-service_tree service count. Revert this change. Signed-off-by: Jens Axboe <jens.axboe@oracle.com>	2009-11-30 09:38:13 +01:00
Corrado Zoccolo	8e550632cc	cfq-iosched: fix corner cases in idling logic Idling logic was disabled in some corner cases, leading to unfair share for noidle queues. * the idle timer was not armed if there were other requests in the driver. unfortunately, those requests could come from other workloads, or queues for which we don't enable idling. So we will check only pending requests from the active queue * rq_noidle check on no-idle queue could disable the end of tree idle if the last completed request was rq_noidle. Now, we will disable that idle only if all the queues served in the no-idle tree had rq_noidle requests. Reported-by: Vivek Goyal <vgoyal@redhat.com> Signed-off-by: Corrado Zoccolo <czoccolo@gmail.com> Acked-by: Vivek Goyal <vgoyal@redhat.com> Signed-off-by: Jens Axboe <jens.axboe@oracle.com>	2009-11-26 10:39:31 +01:00
Corrado Zoccolo	76280aff1c	cfq-iosched: idling on deep seeky sync queues Seeky sync queues with large depth can gain unfairly big share of disk time, at the expense of other seeky queues. This patch ensures that idling will be enabled for queues with I/O depth at least 4, and small think time. The decision to enable idling is sticky, until an idle window times out without seeing a new request. The reasoning behind the decision is that, if an application is using large I/O depth, it is already optimized to make full utilization of the hardware, and therefore we reserve a slice of exclusive use for it. Reported-by: Vivek Goyal <vgoyal@redhat.com> Signed-off-by: Corrado Zoccolo <czoccolo@gmail.com> Signed-off-by: Jens Axboe <jens.axboe@oracle.com>	2009-11-26 10:39:31 +01:00
Corrado Zoccolo	e4a229196a	cfq-iosched: fix no-idle preemption logic An incoming no-idle queue should preempt the active no-idle queue only if the active queue is idling due to service tree empty. Previous code was buggy in two ways: * it relied on service_tree field to be set on the active queue, while it is not set when the code is idling for a new request * it didn't check for the service tree empty condition, so could lead to LIFO behaviour if multiple queues with depth > 1 were preempting each other on an non-NCQ device. Reported-by: Vivek Goyal <vgoyal@redhat.com> Signed-off-by: Corrado Zoccolo <czoccolo@gmail.com> Acked-by: Vivek Goyal <vgoyal@redhat.com> Signed-off-by: Jens Axboe <jens.axboe@oracle.com>	2009-11-26 10:39:31 +01:00
Corrado Zoccolo	e459dd08f4	cfq-iosched: fix ncq detection code CFQ's detection of queueing devices initially assumes a queuing device and detects if the queue depth reaches a certain threshold. However, it will reconsider this choice periodically. Unfortunately, if device is considered not queuing, CFQ will force a unit queue depth for some workloads, thus defeating the detection logic. This leads to poor performance on queuing hardware, since the idle window remains enabled. Given this premise, switching to hw_tag = 0 after we have proved at least once that the device is NCQ capable is not a good choice. The new detection code starts in an indeterminate state, in which CFQ behaves as if hw_tag = 1, and then, if for a long observation period we never saw large depth, we switch to hw_tag = 0, otherwise we stick to hw_tag = 1, without reconsidering it again. Signed-off-by: Corrado Zoccolo <czoccolo@gmail.com> Signed-off-by: Jens Axboe <jens.axboe@oracle.com>	2009-11-26 10:02:57 +01:00
Jens Axboe	75e7b63430	Merge branch 'for-jens' of git://git.drbd.org/linux-2.6-drbd into for-2.6.33	2009-11-26 09:46:51 +01:00
Vivek Goyal	d9449ce35a	Fix regression in direct writes performance due to WRITE_ODIRECT flag removal There seems to be a regression in direct write path due to following commit in for-2.6.33 branch of block tree. commit `1af60fbd75` Author: Jeff Moyer <jmoyer@redhat.com> Date: Fri Oct 2 18:56:53 2009 -0400 block: get rid of the WRITE_ODIRECT flag Marking direct writes as WRITE_SYNC_PLUG instead of WRITE_ODIRECT, sets the NOIDLE flag in bio and hence in request. This tells CFQ to not expect more request from the queue and not idle on it (despite the fact that queue's think time is less and it is not seeky). So direct writers lose big time when competing with sequential readers. Using fio, I have run one direct writer and two sequential readers and following are the results with 2.6.32-rc7 kernel and with for-2.6.33 branch. Test ==== 1 direct writer and 2 sequential reader running simultaneously. [global] directory=/mnt/sdc/fio/ runtime=10 [seqwrite] rw=write size=4G direct=1 [seqread] rw=read size=2G numjobs=2 2.6.32-rc7 ========== direct writes: aggrb=2,968KB/s readers : aggrb=101MB/s for-2.6.33 branch ================= direct write: aggrb=19KB/s readers aggrb=137MB/s This patch brings back the WRITE_ODIRECT flag, with the difference that we don't set the BIO_RW_UNPLUG flag so that device is not unplugged after submission of request and an explicit unplug from submitter is required. That way we fix the jeff's issue of not enough merging taking place in aio path as well as make sure direct writes get their fair share. After the fix ============= for-2.6.33 + fix ---------------- direct writes: aggrb=2,728KB/s reads: aggrb=103MB/s Thanks Vivek Signed-off-by: Vivek Goyal <vgoyal@redhat.com> Signed-off-by: Jens Axboe <jens.axboe@oracle.com>	2009-11-26 09:46:46 +01:00
Corrado Zoccolo	c16632bab1	cfq-iosched: cleanup unreachable code cfq_should_idle returns false for no-idle queues that are not the last, so the control flow will never reach the removed code in a state that satisfies the if condition. The unreachable code was added to emulate previous cfq behaviour for non-NCQ rotational devices. My tests show that even without it, the performances and fairness are comparable with previous cfq, thanks to the fact that all seeky queues are grouped together, and that we idle at the end of the tree. Signed-off-by: Corrado Zoccolo <czoccolo@gmail.com> Acked-by: Vivek Goyal <vgoyal@redhat.com> Signed-off-by: Jens Axboe <jens.axboe@oracle.com>	2009-11-26 09:46:46 +01:00
Ilya Loginov	2d4dc890b5	block: add helpers to run flush_dcache_page() against a bio and a request's pages Mtdblock driver doesn't call flush_dcache_page for pages in request. So, this causes problems on architectures where the icache doesn't fill from the dcache or with dcache aliases. The patch fixes this. The ARCH_IMPLEMENTS_FLUSH_DCACHE_PAGE symbol was introduced to avoid pointless empty cache-thrashing loops on architectures for which flush_dcache_page() is a no-op. Every architecture was provided with this flush pages on architectires where ARCH_IMPLEMENTS_FLUSH_DCACHE_PAGE is equal 1 or do nothing otherwise. See "fix mtd_blkdevs problem with caches on some architectures" discussion on LKML for more information. Signed-off-by: Ilya Loginov <isloginov@gmail.com> Cc: Ingo Molnar <mingo@elte.hu> Cc: David Woodhouse <dwmw2@infradead.org> Cc: Peter Horton <phorton@bitbox.co.uk> Cc: "Ed L. Cashin" <ecashin@coraid.com> Signed-off-by: Jens Axboe <jens.axboe@oracle.com>	2009-11-26 09:16:19 +01:00
Gui Jianfeng	3586e917f2	cfq: Make use of service count to estimate the rb_key offset For the moment, different workload cfq queues are put into different service trees. But CFQ still uses "busy_queues" to estimate rb_key offset when inserting a cfq queue into a service tree. I think this isn't appropriate, and it should make use of service tree count to do this estimation. This patch is for for-2.6.33 branch. Signed-off-by: Gui Jianfeng <guijianfeng@cn.fujitsu.com> Signed-off-by: Jens Axboe <jens.axboe@oracle.com>	2009-11-26 09:14:11 +01:00
Philipp Reisner	35a8a3fdcd	drbd: moved CN_IDX_DRBD and CN_VAL_DRBD to the right file Signed-off-by: Philipp Reisner <philipp.reisner@linbit.com> Signed-off-by: Lars Ellenberg <lars.ellenberg@linbit.com>	2009-11-25 17:57:36 +01:00
Philipp Reisner	ad85dfe67b	DRBD: Now the code is 8.3.6 + 3 fixes (without compat crap) Signed-off-by: Philipp Reisner <philipp.reisner@linbit.com> Signed-off-by: Lars Ellenberg <lars.ellenberg@linbit.com>	2009-11-24 18:20:08 +01:00
Philipp Reisner	d8c2a36b77	Fixed a regression in resync decission code drbd_uuid_compare() [Bugz 260] Since 8.3.3 we fail to do the resync when a partial resynch is not possible, but a full synch is necessary. This regression was introduced with 7101539930c0a89146959e7a39c09ad9c3516434 Signed-off-by: Philipp Reisner <philipp.reisner@linbit.com> Signed-off-by: Lars Ellenberg <lars.ellenberg@linbit.com>	2009-11-24 18:13:28 +01:00
Lars Ellenberg	0b33a9164a	add missing state change on corrupt packet header in drbd_recv_header Otherwise the 'state fixup' in the receiver will change to Unconnected, but the receiver will terminate itself, and any attempt at 'down'ing that drbd later will block forever. see also Bugz. #259 Signed-off-by: Philipp Reisner <philipp.reisner@linbit.com> Signed-off-by: Lars Ellenberg <lars.ellenberg@linbit.com>	2009-11-24 18:12:13 +01:00
Lars Ellenberg	6c6c7951be	fix in-kernel configuration serialization this is uncritical, as we still also serialize in userland, but to correctly serialize on the CONFIG_PENDING bit, it must be wait_event(state_wait, \!test_and_set_bit) Signed-off-by: Philipp Reisner <philipp.reisner@linbit.com> Signed-off-by: Lars Ellenberg <lars.ellenberg@linbit.com>	2009-11-24 18:11:05 +01:00
Alex Chiang	32a87c0114	cciss: change Cmd_sg_list.sg_chain_dma type to dma_addr_t A recent commit broke the ia64 build: Author: Don Brace <brace@beardog.cce.hp.com> Date: Thu Nov 12 12:50:01 2009 -0600 cciss: Add enhanced scatter-gather support. because of this hunk: --- a/drivers/block/cciss.h +++ b/drivers/block/cciss.h +struct Cmd_sg_list { + SGDescriptor_struct *sgchain; + dma64_addr_t sg_chain_dma; + int chain_block_size; +}; The issue is that dma64_addr_t isn't #define'd on ia64. The way that we're using Cmd_sg_list.sg_chain_dma is to hold an address returned from pci_map_single(). + temp64.val = pci_map_single(h->pdev, + h->cmd_sg_list[c->cmdindex]->sgchain, + len, dir); + + h->cmd_sg_list[c->cmdindex]->sg_chain_dma = temp64.val; pci_map_single() returns a dma_addr_t too. This code will still work even on a 32-bit x86 build, where dma_addr_t is defined to be a u32 because it will simply be promoted to the __u64 that temp64.val is defined as. Thus, declaring Cmd_sg_list.sg_chain_dma as dma_addr_t is safe. Cc: Don Brace <brace@beardog.cce.hp.com> Cc: Stephen M. Cameron <scameron@beardog.cce.hp.com> Signed-off-by: Alex Chiang <achiang@hp.com> Signed-off-by: Jens Axboe <jens.axboe@oracle.com>	2009-11-23 09:35:06 +01:00
Stephen M. Cameron	d61c42690c	cciss: fix scatter gather cleanup problems On driver unload, only free up the extra scatter gather data if they were allocated in the first place (the controller supports it) and don't forget to free up the sg_cmd_list array of pointers. Signed-off-by: Don Brace <brace@beardog.cce.hp.com> Signed-off-by: Stephen M. Cameron <scameron@beardog.cce.hp.com> Signed-off-by: Jens Axboe <jens.axboe@oracle.com>	2009-11-23 09:31:48 +01:00
Karel Zak	87038c2d5b	partitions: read whole sector with EFI GPT header The size of EFI GPT header is not static, but whole sector is allocated for the header. The HeaderSize field must be greater than 92 (= sizeof(struct gpt_header) and must be less than or equal to the logical block size. It means we have to read whole sector with the header, because the header crc32 checksum is calculated according to HeaderSize. For more details see UEFI standard (version 2.3, May 2009): - 5.3.1 GUID Format overview, page 93 - Table 13. GUID Partition Table Header, page 96 Signed-off-by: Karel Zak <kzak@redhat.com> Signed-off-by: Jens Axboe <jens.axboe@oracle.com>	2009-11-23 09:29:58 +01:00
Karel Zak	7d13af3279	partitions: use sector size for EFI GPT Currently, kernel uses strictly 512-byte sectors for EFI GPT parsing. That's wrong. UEFI standard (version 2.3, May 2009, 5.3.1 GUID Format overview, page 95) defines that LBA is always based on the logical block size. It means bdev_logical_block_size() (aka BLKSSZGET) for Linux. This patch removes static sector size from EFI GPT parser. The problem is reproducible with the latest GNU Parted: # modprobe scsi_debug dev_size_mb=50 sector_size=4096 # ./parted /dev/sdb print Model: Linux scsi_debug (scsi) Disk /dev/sdb: 52.4MB Sector size (logical/physical): 4096B/4096B Partition Table: gpt Number Start End Size File system Name Flags 1 24.6kB 3002kB 2978kB primary 2 3002kB 6001kB 2998kB primary 3 6001kB 9003kB 3002kB primary # blockdev --rereadpt /dev/sdb # dmesg \| tail -1 sdb: unknown partition table <---- !!! with this patch: # blockdev --rereadpt /dev/sdb # dmesg \| tail -1 sdb: sdb1 sdb2 sdb3 Signed-off-by: Karel Zak <kzak@redhat.com> Signed-off-by: Jens Axboe <jens.axboe@oracle.com>	2009-11-23 09:29:13 +01:00
Stephen M. Cameron	8721c81f64	cciss: Fix weird usage of ENXIO in cciss_scsi.c cciss: Fix weird usage of ENXIO in cciss_scsi.c Signed-off-by: Stephen M. Cameron <scameron@beardog.cce.hp.com> Signed-off-by: Jens Axboe <jens.axboe@oracle.com>	2009-11-13 08:45:54 +01:00
Don Brace	5c07a311a8	cciss: Add enhanced scatter-gather support. cciss: Add enhanced scatter-gather support. For controllers which supported, more than 512 scatter-gather elements per command may be used, and the max transfer size can be increased to 8192 blocks. Signed-off-by: Don Brace <brace@beardog.cce.hp.com> Signed-off-by: Stephen M. Cameron <scameron@beardog.cce.hp.com> Signed-off-by: Jens Axboe <jens.axboe@oracle.com>	2009-11-13 08:45:54 +01:00
Stephen M. Cameron	da0021841c	cciss: Do not automatically rescan on UNIT ATTENTION/LUN DATA CHANGED cciss: Do not automatically rescan on UNIT ATTENTION/LUN DATA CHANGED There are problems with doing this. If, say, several logical drives are deleted at once, several such UNIT ATTENTIONS will be encountered, often during the rescan triggered by the first such UNIT ATTENTION. The block layer may be in the midst of trying to add logical drives which were just deleted (resulting in the subsequent UNIT ATTENTION(s).) Making the rescan code robust enough to tolerate this kind of thing is too complicated for the moment. So, for now, we just don't do it. Note: This UNIT ATTENTION/LUN DATA CHANGED situation only occurs on the MSA2012. Signed-off-by: Stephen M. Cameron <scameron@beardog.cce.hp.com> Signed-off-by: Jens Axboe <jens.axboe@oracle.com>	2009-11-13 08:45:53 +01:00
Stephen M. Cameron	d06dfbd236	cciss: Remove unnecessary check in scan_thread cciss: Remove unnecessary check in scan_thread Signed-off-by: Stephen M. Cameron <scameron@beardog.cce.hp.com> Signed-off-by: Jens Axboe <jens.axboe@oracle.com>	2009-11-13 08:45:53 +01:00
Stephen M. Cameron	b0e15f6db1	cciss: fix typo that causes scsi status to be lost. cciss: fix typo that causes scsi status to be lost. Signed-off-by: Stephen M. Cameron <scameron@beardog.cce.hp.com> Signed-off-by: Jens Axboe <jens.axboe@oracle.com>	2009-11-13 08:45:53 +01:00
Stephen M. Cameron	aa43f11147	cciss: remove sendcmd() as it is no longer used. cciss: remove sendcmd() as it is no longer used. Signed-off-by: Stephen M. Cameron <scameron@beardog.cce.hp.com> Signed-off-by: Jens Axboe <jens.axboe@oracle.com>	2009-11-13 08:45:53 +01:00
Stephen M. Cameron	29009a036f	cciss: clean up code in cciss_shutdown cciss: clean up code in cciss_shutdown. Send the flush cache command down with interrupts still enabled, and do not do DMA from the stack. Signed-off-by: Stephen M. Cameron <scameron@beardog.cce.hp.com> Signed-off-by: Jens Axboe <jens.axboe@oracle.com>	2009-11-13 08:45:53 +01:00
Stephen M. Cameron	7b838bde92	cciss: Remove the "withirq" parameter from various functions where possible cciss: Remove the "withirq" parameter from various functions where possible Signed-off-by: Stephen M. Cameron <scameron@beardog.cce.hp.com> Signed-off-by: Jens Axboe <jens.axboe@oracle.com>	2009-11-13 08:45:53 +01:00
Stephen M. Cameron	c08fac6500	cciss: Retry driver initiated cmds with unit attention condition cciss: Retry driver initiated cmds with unit attention condition Signed-off-by: Stephen M. Cameron <scameron@beardog.cce.hp.com> Signed-off-by: Jens Axboe <jens.axboe@oracle.com>	2009-11-13 08:45:53 +01:00
Stephen M. Cameron	fd8489cff4	cciss: Fix problem with remove_from_scan_list on driver unload cciss: Fix problem with remove_from_scan_list that on driver unload it doesn't remove the controller from the scan list correctly if the controller is currently being scanned for new devices. Signed-off-by: Stephen M. Cameron <scameron@beardog.cce.hp.com> Signed-off-by: Jens Axboe <jens.axboe@oracle.com>	2009-11-13 08:45:53 +01:00
Alex Chiang	8ba95c69fe	cciss: Make device attributes static cciss: Make device attributes static Cc: Stephen M. Cameron <scameron@beardog.cce.hp.com> Signed-off-by: Alex Chiang <achiang@hp.com> Acked-by: Stephen M. Cameron <scameron@beardog.cce.hp.com> Signed-off-by: Jens Axboe <jens.axboe@oracle.com>	2009-11-13 08:45:52 +01:00
Randy Dunlap	ad5ebd2fa2	block: jiffies fixes Use HZ-independent calculation of milliseconds. Add jiffies.h where it was missing since functions or macros from it are used. Signed-off-by: Randy Dunlap <randy.dunlap@oracle.com> Signed-off-by: Jens Axboe <jens.axboe@oracle.com>	2009-11-11 13:47:45 +01:00
Martin K. Petersen	86b3728141	block: Expose discard granularity While SSDs track block usage on a per-sector basis, RAID arrays often have allocation blocks that are bigger. Allow the discard granularity and alignment to be set and teach the topology stacking logic how to handle them. Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com> Signed-off-by: Jens Axboe <jens.axboe@oracle.com>	2009-11-10 11:50:21 +01:00
Corrado Zoccolo	cf7c25cf91	cfq-iosched: fix next_rq computation Cfq has a bug in computation of next_rq, that affects transition between multiple sequential request streams in a single queue (e.g.: two sequential buffered writers of the same priority), causing the alternation between the two streams for a transient period. 8,0 1 18737 0.260400660 5312 D W 141653311 + 256 8,0 1 20839 0.273239461 5400 D W 141653567 + 256 8,0 1 20841 0.276343885 5394 D W 142803919 + 256 8,0 1 20843 0.279490878 5394 D W 141668927 + 256 8,0 1 20845 0.292459993 5400 D W 142804175 + 256 8,0 1 20847 0.295537247 5400 D W 141668671 + 256 8,0 1 20849 0.298656337 5400 D W 142804431 + 256 8,0 1 20851 0.311481148 5394 D W 141668415 + 256 8,0 1 20853 0.314421305 5394 D W 142804687 + 256 8,0 1 20855 0.318960112 5400 D W 142804943 + 256 The fix makes sure that the next_rq is computed from the last dispatched request, and not affected by merging. 8,0 1 37776 4.305161306 0 D W 141738087 + 256 8,0 1 37778 4.308298091 0 D W 141738343 + 256 8,0 1 37780 4.312885190 0 D W 141738599 + 256 8,0 1 37782 4.315933291 0 D W 141738855 + 256 8,0 1 37784 4.319064459 0 D W 141739111 + 256 8,0 1 37786 4.331918431 5672 D W 142803007 + 256 8,0 1 37788 4.334930332 5672 D W 142803263 + 256 8,0 1 37790 4.337902723 5672 D W 142803519 + 256 8,0 1 37792 4.342359774 5672 D W 142803775 + 256 8,0 1 37794 4.345318286 0 D W 142804031 + 256 Signed-off-by: Corrado Zoccolo <czoccolo@gmail.com> Signed-off-by: Jens Axboe <jens.axboe@oracle.com>	2009-11-08 17:16:46 +01:00
Jens Axboe	622d32d3ec	Merge branch 'for-jens' of git://git.drbd.org/linux-2.6-drbd into for-2.6.33	2009-11-04 18:38:23 +01:00
Philipp Reisner	ed814525f2	Now it is equal to DRBD release 8.3.5 without compat crap Signed-off-by: Philipp Reisner <philipp.reisner@linbit.com> Signed-off-by: Lars Ellenberg <lars.ellenberg@linbit.com>	2009-11-04 15:40:10 +01:00
Lars Ellenberg	83c38830b0	drbd: performance - don't lose unplug events Signed-off-by: Philipp Reisner <philipp.reisner@linbit.com> Signed-off-by: Lars Ellenberg <lars.ellenberg@linbit.com>	2009-11-04 15:21:04 +01:00
Philipp Reisner	e656ec8ae2	Do not deadlock in drbd_disconnect() [bugz 258] When there are many blocks on the fly (ua), and the AL gets into "starving" mode (random IO, scattered all over the device), and the connections gets interrupted, the receiver thread deadlocks in the drbd_disconnect() code path. Affected are only nodes in Primary role. The bug triggers most likely on system that mirror over "long distances" Regression introduced shortly before 8.3.3 with git commit 31e0f1250f174ac1ee317f360943a0159e19edc8 Signed-off-by: Philipp Reisner <philipp.reisner@linbit.com> Signed-off-by: Lars Ellenberg <lars.ellenberg@linbit.com>	2009-11-04 15:21:03 +01:00
Philipp Reisner	0a49216625	drbdsetup X resume-io should be usable to resume IO [Bugz 256] When IO gets frozen due to a broken fence-peer script, the user should be able to thaw IO by the resume-io command. Signed-off-by: Philipp Reisner <philipp.reisner@linbit.com> Signed-off-by: Lars Ellenberg <lars.ellenberg@linbit.com>	2009-11-04 15:21:01 +01:00
Lars Ellenberg	1352994b36	drbd: fix check for too large lower level device To check wether we are truncating a very large device due to limited meta data space, we need to check the ll_dev size. Also improve the printk to suggest "flexible" or "internal". Signed-off-by: Philipp Reisner <philipp.reisner@linbit.com> Signed-off-by: Lars Ellenberg <lars.ellenberg@linbit.com>	2009-11-04 15:21:00 +01:00
Lars Ellenberg	ad19bf6e54	fix grammar in printk Signed-off-by: Philipp Reisner <philipp.reisner@linbit.com> Signed-off-by: Lars Ellenberg <lars.ellenberg@linbit.com>	2009-11-04 15:20:59 +01:00
Lars Ellenberg	89e1838f5f	change default: by default, use socket buffer auto tuning Signed-off-by: Philipp Reisner <philipp.reisner@linbit.com> Signed-off-by: Lars Ellenberg <lars.ellenberg@linbit.com>	2009-11-04 15:20:57 +01:00
H Hartley Sweeten	476d42f138	block/scsi_ioctl.c: quiet sparse noise Quiet sparse noise about symbol's not being declared. Symbol blk_default_cmd_filter is only used locally and should be static. The function blk_scsi_ioctl_init() is a fs_initcall and should also be static. Signed-off-by: H Hartley Sweeten <hsweeten@visionengravers.com> Cc: James Bottomley <James.Bottomley@HansenPartnership.com> Signed-off-by: Jens Axboe <jens.axboe@oracle.com>	2009-11-04 09:10:33 +01:00
Changli Gao	cc56f7de7f	sendfile(): check f_op.splice_write() rather than f_op.sendpage() sendfile(2) was reworked with the splice infrastructure, but it still checks f_op.sendpage() instead of f_op.splice_write() wrongly. Although if f_op.sendpage() exists, f_op.splice_write() always exists at the same time currently, the assumption will be broken in future silently. This patch also brings a side effect: sendfile(2) can work with any output file. Some security checks related to f_op are added too. Signed-off-by: Changli Gao <xiaosuo@gmail.com> Signed-off-by: Jens Axboe <jens.axboe@oracle.com>	2009-11-04 09:09:52 +01:00
Hideyuki Sasaki	f21121cde6	block/ps3: fix slow VRAM IO The current PS3 VRAM driver uses msleep() to wait for completion of RSX DMA transfers between system memory and VRAM. Depending on the system timing, the processing delay and overhead of this msleep() call can significantly impact VRAM driver IO. To avoid the condition, add a short duration (200 usec max) udelay() polling loop before entering the msleep() polling loop. Signed-off-by: Hideyuki Sasaki <xhide@rd.scei.sony.co.jp> Signed-off-by: Geoff Levand <geoffrey.levand@am.sony.com> Acked-by: Jim Paris <jim@jtan.com> Cc: Geert Uytterhoeven <Geert.Uytterhoeven@sonycom.com> Signed-off-by: Jens Axboe <jens.axboe@oracle.com>	2009-11-04 09:09:28 +01:00
Jens Axboe	e00ef79971	cfq-iosched: get rid of the coop_preempt flag We need to rework this logic post the cooperating cfq_queue merging, for now just get rid of it and Jeff Moyer will fix the fall out. Signed-off-by: Jens Axboe <jens.axboe@oracle.com>	2009-11-04 08:54:55 +01:00
Jens Axboe	125c4f221a	cfq-iosched: fix merge error We ended up with testing the same condition twice, pretty pointless. Remove that first if. Signed-off-by: Jens Axboe <jens.axboe@oracle.com>	2009-11-03 21:25:45 +01:00
Jens Axboe	2058297d2d	Merge branch 'for-linus' into for-2.6.33 Conflicts: block/cfq-iosched.c Signed-off-by: Jens Axboe <jens.axboe@oracle.com>	2009-11-03 21:14:39 +01:00
Jens Axboe	150e6c67f4	Merge branch 'cfq-2.6.33' into for-2.6.33	2009-11-03 21:12:10 +01:00
Shaohua Li	4b27e1bb44	cfq-iosched: limit coop preemption CFQ has an optimization for cooperated applications. if several io-context have close requests, they will get boost. But the optimization get abused. Considering thread a, b, which work on one file. a reads sectors s, s+2, s+4, ...; b reads sectors s+1, s+3, s +5, ... Both a and b are sequential read, so they can open idle window. a reads a sector s and goes to idle window and wakeup b. b reads sector s+1, since in current implementation, cfq_should_preempt() thinks a and b are cooperators, b will preempt a. b then reads sector s+1 and goes to idle window and wakeup a. for the same reason, a will preempt b and reads s+2. a and b will continue the circle. The circle will be very long, and a and b will occupy whole disk queue. Other applications will nearly have no chance to run. Fix this limiting coop preempt until a queue is scheduled normally again. Signed-off-by: Shaohua Li <shaohua.li@intel.com> Acked-by: Jeff Moyer <jmoyer@redhat.com> Signed-off-by: Jens Axboe <jens.axboe@oracle.com>	2009-11-03 20:25:02 +01:00

1 2 3 4 5 ...

168126 Commits