kernel_optimize_test/mm
Suparna Bhattacharya b5c44c2147 [PATCH] fix for __generic_file_aio_read() to return 0 on EOF
I came across the following problem while running ltp-aiodio testcases from
ltp-full-20050405 on linux-2.6.12-rc3-mm3.  I tried running the tests with
EXT3 as well as JFS filesystems.

One or two fsx-linux testcases were hung after some time.  These testcases
were hanging at wait_for_all_aios().

Debugging showed that some iocbs were never completed even though the
last retry for them returned -EIOCBQUEUED.  All such pending iocbs
represented READ operations.

Further debugging revealed that all such iocbs hit EOF in the DIO layer.
To be more precise, the "pos" from which they were trying to read was
greater than the "size" of the file, so generic_file_direct_IO()
returned 0.

This happens rarely, as __generic_file_aio_read() already checks whether
"pos" < "size" before calling the direct IO routine.

>size = i_size_read(inode);
>if (pos < size) {
>	  retval = generic_file_direct_IO(READ, iocb,
>                               iov, pos, nr_segs);

But for READ, we take inode->i_sem only in the DIO layer, so some other
process can change the size of the file before we take i_sem.  In such a
case (when "pos" > "size"), __generic_file_aio_read() would return
-EIOCBQUEUED even though no I/O requests were submitted by the DIO layer.
This causes the AIO layer to expect an aio_complete() for the iocb, which
never happens.  The test thus hangs forever, waiting for an I/O completion
when no requests were submitted at all.

The following patch makes __generic_file_aio_read() return 0 (instead of
returning -EIOCBQUEUED), on getting 0 from generic_file_direct_IO(), so
that the AIO layer does the aio_complete().
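
The change in return-value handling can be sketched in user space as
below.  This is only a minimal model under stated assumptions, not the
patch itself: struct model_iocb, model_direct_read(), map_retval_old()
and map_retval_new() are hypothetical names introduced for illustration,
and the "old" condition (retval >= 0) is inferred from the behaviour
described above rather than quoted from mm/filemap.c.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Numeric value the kernel uses internally for EIOCBQUEUED; it is
 * redefined here because user-space headers do not export it. */
#define EIOCBQUEUED 529

/* Hypothetical stand-in for struct kiocb: only sync vs. async matters. */
struct model_iocb {
	bool is_sync;
};

/* Hypothetical stand-in for generic_file_direct_IO(READ, ...): returns
 * the number of bytes the DIO layer would transfer, which is 0 when
 * "pos" is at or beyond "size" (EOF). */
long model_direct_read(long long pos, long long size, long len)
{
	long long avail;

	if (pos >= size)
		return 0;
	avail = size - pos;
	return avail < len ? (long)avail : len;
}

/* Assumed behaviour before the fix: any non-negative result on an
 * async iocb is reported as queued, including a 0-byte EOF read. */
long map_retval_old(const struct model_iocb *iocb, long retval)
{
	assert(iocb != NULL);
	if (retval >= 0 && !iocb->is_sync)
		retval = -EIOCBQUEUED;
	return retval;
}

/* Behaviour described by this patch: 0 from the DIO layer is passed
 * through, so the caller can complete the iocb itself. */
long map_retval_new(const struct model_iocb *iocb, long retval)
{
	assert(iocb != NULL);
	if (retval > 0 && !iocb->is_sync)
		retval = -EIOCBQUEUED;
	return retval;
}
```

With an async iocb and a read at pos = 4096 in a file truncated to
size = 1024, model_direct_read() yields 0; the old mapping turns that
into -EIOCBQUEUED (a completion that never arrives), while the new
mapping returns 0.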

Testing:

I have tested the patch on an SMP machine (with 2 Pentium 4 (HT) CPUs)
running linux-2.6.12-rc3-mm3.  I ran the ltp-aiodio testcases and none of
the fsx-linux tests hung.  The aio-stress tests also ran without any
problem.

Signed-off-by: Suzuki K P <suzuki@in.ibm.com>
Signed-off-by: Suparna Bhattacharya <suparna@in.ibm.com>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2005-05-21 16:45:24 -07:00
bootmem.c Linux-2.6.12-rc2 2005-04-16 15:20:36 -07:00
fadvise.c Linux-2.6.12-rc2 2005-04-16 15:20:36 -07:00
filemap.c [PATCH] fix for __generic_file_aio_read() to return 0 on EOF 2005-05-21 16:45:24 -07:00
fremap.c Linux-2.6.12-rc2 2005-04-16 15:20:36 -07:00
highmem.c [PATCH] count bounce buffer pages in vmstat 2005-05-01 08:58:37 -07:00
hugetlb.c Linux-2.6.12-rc2 2005-04-16 15:20:36 -07:00
internal.h Linux-2.6.12-rc2 2005-04-16 15:20:36 -07:00
madvise.c Linux-2.6.12-rc2 2005-04-16 15:20:36 -07:00
Makefile Linux-2.6.12-rc2 2005-04-16 15:20:36 -07:00
memory.c [PATCH] do_swap_page() can map random data if swap read fails 2005-05-17 07:59:20 -07:00
mempolicy.c [PATCH] mempolicy.c GFP fix 2005-04-24 12:28:34 -07:00
mempool.c [PATCH] use smp_mb/wmb/rmb where possible 2005-05-01 08:58:47 -07:00
mincore.c [PATCH] freepgt: sys_mincore ignore FIRST_USER_PGD_NR 2005-04-19 13:29:20 -07:00
mlock.c Linux-2.6.12-rc2 2005-04-16 15:20:36 -07:00
mmap.c Fix get_unmapped_area sanity tests 2005-05-19 22:43:37 -07:00
mprotect.c Linux-2.6.12-rc2 2005-04-16 15:20:36 -07:00
mremap.c [PATCH] mm acct accounting fix 2005-05-17 07:59:12 -07:00
msync.c Linux-2.6.12-rc2 2005-04-16 15:20:36 -07:00
nommu.c [PATCH] mm/nommu.c: try to fix __vmalloc 2005-05-17 07:59:17 -07:00
oom_kill.c [PATCH] oom-killer disable for iscsi/lvm2/multipath userland critical sections 2005-04-16 15:24:05 -07:00
page_alloc.c [IA64] Export node_online_map and node_possible_map 2005-05-03 12:09:32 -07:00
page_io.c Linux-2.6.12-rc2 2005-04-16 15:20:36 -07:00
page-writeback.c [PATCH] DocBook: fix some descriptions 2005-05-01 08:59:26 -07:00
pdflush.c Linux-2.6.12-rc2 2005-04-16 15:20:36 -07:00
prio_tree.c Linux-2.6.12-rc2 2005-04-16 15:20:36 -07:00
readahead.c Linux-2.6.12-rc2 2005-04-16 15:20:36 -07:00
rmap.c [PATCH] mm: fix rss counter being incremented when unmapping 2005-05-17 07:59:12 -07:00
shmem.c Linux-2.6.12-rc2 2005-04-16 15:20:36 -07:00
slab.c [PATCH] Change synchronize_kernel to _rcu and _sched 2005-05-01 08:59:04 -07:00
swap_state.c [PATCH] mm: use __GFP_NOMEMALLOC 2005-05-01 08:58:37 -07:00
swap.c Linux-2.6.12-rc2 2005-04-16 15:20:36 -07:00
swapfile.c [PATCH] swapout oops fix 2005-05-17 07:59:18 -07:00
thrash.c Linux-2.6.12-rc2 2005-04-16 15:20:36 -07:00
tiny-shmem.c Linux-2.6.12-rc2 2005-04-16 15:20:36 -07:00
truncate.c [PATCH] DocBook: fix some descriptions 2005-05-01 08:59:26 -07:00
vmalloc.c [PATCH] x86_64: Fixed guard page handling again in iounmap 2005-05-20 15:48:20 -07:00
vmscan.c [PATCH] vmscan: pageout(): remove unneeded test 2005-04-16 15:24:06 -07:00