kernel_optimize_test/fs
Eric Dumazet 7e360c38ab fs: allow for more than 2^31 files
Andrew,

Could you please review this patch, you probably are the right guy to
take it, because it crosses fs and net trees.

Note : /proc/sys/fs/file-nr is a read-only file, so this patch doesnt
depend on previous patch (sysctl: fix min/max handling in
__do_proc_doulongvec_minmax())

Thanks !

[PATCH V4] fs: allow for more than 2^31 files

Robin Holt tried to boot a 16TB system and found af_unix was overflowing
a 32bit value :

<quote>

We were seeing a failure which prevented boot.  The kernel was incapable
of creating either a named pipe or unix domain socket.  This comes down
to a common kernel function called unix_create1() which does:

        atomic_inc(&unix_nr_socks);
        if (atomic_read(&unix_nr_socks) > 2 * get_max_files())
                goto out;

The function get_max_files() is a simple return of files_stat.max_files.
files_stat.max_files is a signed integer and is computed in
fs/file_table.c's files_init().

        n = (mempages * (PAGE_SIZE / 1024)) / 10;
        files_stat.max_files = n;

In our case, mempages (total_ram_pages) is approx 3,758,096,384
(0xe0000000).  That leaves max_files at approximately 1,503,238,553.
This causes 2 * get_max_files() to integer overflow.

</quote>

Fix is to let /proc/sys/fs/file-nr & /proc/sys/fs/file-max use long
integers, and change af_unix to use an atomic_long_t instead of
atomic_t.

get_max_files() is changed to return an unsigned long.
get_nr_files() is changed to return a long.

unix_nr_socks is changed from atomic_t to atomic_long_t, while not
strictly needed to address Robin problem.

Before patch (on a 64bit kernel) :
# echo 2147483648 >/proc/sys/fs/file-max
# cat /proc/sys/fs/file-max
-18446744071562067968

After patch:
# echo 2147483648 >/proc/sys/fs/file-max
# cat /proc/sys/fs/file-max
2147483648
# cat /proc/sys/fs/file-nr
704     0       2147483648

Reported-by: Robin Holt <holt@sgi.com>
Signed-off-by: Eric Dumazet <eric.dumazet@gmail.com>
Acked-by: David Miller <davem@davemloft.net>
Reviewed-by: Robin Holt <holt@sgi.com>
Tested-by: Robin Holt <holt@sgi.com>
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
2010-10-25 21:18:20 -04:00
..
9p fs/9p: Don't use dotl version of mknod for dotu inode operations 2010-09-13 08:13:03 -05:00
adfs Merge branch 'vfs' of git://git.kernel.org/pub/scm/linux/kernel/git/arnd/bkl 2010-10-22 10:52:01 -07:00
affs Merge branch 'vfs' of git://git.kernel.org/pub/scm/linux/kernel/git/arnd/bkl 2010-10-22 10:52:01 -07:00
afs Merge branch 'llseek' of git://git.kernel.org/pub/scm/linux/kernel/git/arnd/bkl 2010-10-22 10:52:56 -07:00
autofs Merge branch 'vfs' of git://git.kernel.org/pub/scm/linux/kernel/git/arnd/bkl 2010-10-22 10:52:01 -07:00
autofs4 Merge branch 'llseek' of git://git.kernel.org/pub/scm/linux/kernel/git/arnd/bkl 2010-10-22 10:52:56 -07:00
befs
bfs BKL: Remove BKL from BFS 2010-10-04 21:10:35 +02:00
btrfs Merge branch 'for-2.6.37/barrier' of git://git.kernel.dk/linux-2.6-block 2010-10-22 17:07:18 -07:00
cachefiles llseek: automatically add .llseek fop 2010-10-15 15:53:27 +02:00
ceph ceph: do not carry i_lock for readdir from dcache 2010-10-20 15:38:27 -07:00
cifs Merge git://git.kernel.org/pub/scm/linux/kernel/git/sfrench/cifs-2.6 2010-10-22 17:52:29 -07:00
coda Coda: replace BKL with mutex 2010-10-25 08:02:40 -07:00
configfs
cramfs cramfs: only unlock new inodes 2010-08-18 01:01:33 -04:00
debugfs llseek: automatically add .llseek fop 2010-10-15 15:53:27 +02:00
devpts
dlm Merge branch 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/teigland/dlm 2010-10-22 17:33:16 -07:00
ecryptfs Merge branch 'for-next' of git://git.kernel.org/pub/scm/linux/kernel/git/jikos/trivial 2010-10-24 13:41:39 -07:00
efs
exofs fs: add sync_inode_metadata 2010-10-25 21:18:19 -04:00
exportfs
ext2 fs: add sync_inode_metadata 2010-10-25 21:18:19 -04:00
ext3 fs: kill block_prepare_write 2010-10-25 21:18:20 -04:00
ext4 fs: kill block_prepare_write 2010-10-25 21:18:20 -04:00
fat Merge branch 'for-2.6.37/barrier' of git://git.kernel.dk/linux-2.6-block 2010-10-22 17:07:18 -07:00
freevxfs BKL: remove BKL from freevxfs 2010-10-21 18:48:09 +02:00
fscache Add a dummy printk function for the maintenance of unused printks 2010-08-12 09:51:35 -07:00
fuse Merge branch 'llseek' of git://git.kernel.org/pub/scm/linux/kernel/git/arnd/bkl 2010-10-22 10:52:56 -07:00
gfs2 fs: kill block_prepare_write 2010-10-25 21:18:20 -04:00
hfs Merge branch 'vfs' of git://git.kernel.org/pub/scm/linux/kernel/git/arnd/bkl 2010-10-22 10:52:01 -07:00
hfsplus hfsplus: fix getxattr return value 2010-10-15 05:45:00 -07:00
hostfs Merge branch 'for-next' of git://git.kernel.org/pub/scm/linux/kernel/git/jikos/trivial 2010-10-24 13:41:39 -07:00
hpfs Merge branch 'vfs' of git://git.kernel.org/pub/scm/linux/kernel/git/arnd/bkl 2010-10-22 10:52:01 -07:00
hppfs llseek: automatically add .llseek fop 2010-10-15 15:53:27 +02:00
hugetlbfs llseek: automatically add .llseek fop 2010-10-15 15:53:27 +02:00
isofs isofs: Fix isofs_get_blocks for 8TB files 2010-10-25 21:18:20 -04:00
jbd Merge branch 'for-2.6.37/barrier' of git://git.kernel.dk/linux-2.6-block 2010-10-22 17:07:18 -07:00
jbd2 Merge branch 'for-2.6.37/barrier' of git://git.kernel.dk/linux-2.6-block 2010-10-22 17:07:18 -07:00
jffs2 BKL: Remove BKL from jffs2 2010-10-04 21:10:50 +02:00
jfs Merge branch 'for-next' of git://git.kernel.org/pub/scm/linux/kernel/git/jikos/trivial 2010-10-24 13:41:39 -07:00
lockd
logfs llseek: automatically add .llseek fop 2010-10-15 15:53:27 +02:00
minix minix: fix regression in minix_mkdir() 2010-09-09 18:57:25 -07:00
ncpfs ncpfs: Lock socket in ncpfs while setting its callbacks 2010-10-05 11:02:14 +02:00
nfs Merge branch 'vfs' of git://git.kernel.org/pub/scm/linux/kernel/git/arnd/bkl 2010-10-22 10:52:01 -07:00
nfs_common
nfsd fs: add sync_inode_metadata 2010-10-25 21:18:19 -04:00
nilfs2 Merge branch 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/ryusuke/nilfs2 2010-10-23 01:26:47 -07:00
nls
notify Merge branch 'llseek' of git://git.kernel.org/pub/scm/linux/kernel/git/arnd/bkl 2010-10-22 10:52:56 -07:00
ntfs BKL: Remove BKL from NTFS 2010-10-04 21:10:41 +02:00
ocfs2 fs: kill block_prepare_write 2010-10-25 21:18:20 -04:00
omfs Merge branch 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/bcopeland/omfs 2010-08-10 11:47:36 -07:00
openpromfs
partitions Merge branch 'for-linus' of git://git.kernel.dk/linux-2.6-block 2010-10-25 07:45:10 -07:00
proc Revert "tty: Add a new file /proc/tty/consoles" 2010-10-23 08:14:12 -07:00
qnx4 BKL: remove BKL from qnx4 2010-10-21 18:48:04 +02:00
quota Merge branch 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs-2.6 2010-08-10 11:26:52 -07:00
ramfs check ATTR_SIZE contraints in inode_change_ok 2010-08-09 16:47:39 -04:00
reiserfs fs: kill block_prepare_write 2010-10-25 21:18:20 -04:00
romfs llseek: automatically add .llseek fop 2010-10-15 15:53:27 +02:00
smbfs Merge branch 'vfs' of git://git.kernel.org/pub/scm/linux/kernel/git/arnd/bkl 2010-10-22 10:52:01 -07:00
squashfs Merge branch 'llseek' of git://git.kernel.org/pub/scm/linux/kernel/git/arnd/bkl 2010-10-22 10:52:56 -07:00
sysfs Merge git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/driver-core-2.6 2010-10-22 19:36:42 -07:00
sysv fs/sysv/super.c: add support for non-PDP11 v7 filesystems 2010-08-11 08:59:23 -07:00
ubifs Merge branch 'linux-next' of git://git.infradead.org/ubifs-2.6 2010-10-22 16:34:03 -07:00
udf Merge branch 'vfs' of git://git.kernel.org/pub/scm/linux/kernel/git/arnd/bkl 2010-10-22 10:52:01 -07:00
ufs Merge branch 'vfs' of git://git.kernel.org/pub/scm/linux/kernel/git/arnd/bkl 2010-10-22 10:52:01 -07:00
xfs fs: kill block_prepare_write 2010-10-25 21:18:20 -04:00
aio.c aio: do not return ERESTARTSYS as a result of AIO 2010-09-22 17:22:39 -07:00
anon_inodes.c
attr.c check ATTR_SIZE contraints in inode_change_ok 2010-08-09 16:47:39 -04:00
bad_inode.c bkl: Remove locked .ioctl file operation 2010-08-14 00:24:24 +02:00
binfmt_aout.c Don't dump task struct in a.out core-dumps 2010-10-14 10:57:40 -07:00
binfmt_elf_fdpic.c
binfmt_elf.c ARM: 6342/1: fix ASLR of PIE executables 2010-10-08 10:02:53 +01:00
binfmt_em86.c
binfmt_flat.c
binfmt_misc.c llseek: automatically add .llseek fop 2010-10-15 15:53:27 +02:00
binfmt_script.c Make do_execve() take a const filename pointer 2010-08-17 18:07:43 -07:00
binfmt_som.c
bio-integrity.c fs/bio-integrity.c: return -ENOMEM on kmalloc failure 2010-08-23 13:36:59 +02:00
bio.c block: unify flags for struct bio and struct request 2010-08-07 18:20:39 +02:00
block_dev.c block: remove BLKDEV_IFL_WAIT 2010-09-16 20:52:58 +02:00
buffer.c fs: kill block_prepare_write 2010-10-25 21:18:20 -04:00
char_dev.c Merge branch 'llseek' of git://git.kernel.org/pub/scm/linux/kernel/git/arnd/bkl 2010-10-22 10:52:56 -07:00
compat_binfmt_elf.c
compat_ioctl.c fix rawctl compat ioctls breakage on amd64 and itanic 2010-10-19 11:29:54 +02:00
compat.c Prevent freeing uninitialized pointer in compat_do_readv_writev 2010-09-22 17:22:38 -07:00
dcache.c fs: brlock vfsmount_lock 2010-08-18 08:35:48 -04:00
dcookies.c
direct-io.c O_DIRECT: fix the splitting up of contiguous I/O 2010-09-09 18:57:22 -07:00
drop_caches.c simplify checks for I_CLEAR/I_FREEING 2010-08-09 16:47:44 -04:00
eventfd.c llseek: automatically add .llseek fop 2010-10-15 15:53:27 +02:00
eventpoll.c llseek: automatically add .llseek fop 2010-10-15 15:53:27 +02:00
exec.c Export dump_{write,seek} to binary loader modules 2010-10-14 19:15:28 -07:00
fcntl.c vfs: take O_NONBLOCK out of the O_* uniqueness test 2010-09-09 18:57:25 -07:00
fifo.c llseek: automatically add .llseek fop 2010-10-15 15:53:27 +02:00
file_table.c fs: allow for more than 2^31 files 2010-10-25 21:18:20 -04:00
file.c vfs: use kmalloc() to allocate fdmem if possible 2010-08-11 08:59:02 -07:00
filesystems.c
fs_struct.c fs: fs_struct rwlock to spinlock 2010-08-18 08:35:46 -04:00
fs-writeback.c fs: add sync_inode_metadata 2010-10-25 21:18:19 -04:00
generic_acl.c vfs: update ctime when changing the file's permission by setfacl 2010-08-18 01:04:22 -04:00
inode.c fs: mark destroy_inode static 2010-10-25 21:18:19 -04:00
internal.h fs: brlock vfsmount_lock 2010-08-18 08:35:48 -04:00
ioctl.c bkl: Remove locked .ioctl file operation 2010-08-14 00:24:24 +02:00
ioprio.c
Kconfig BKL: introduce CONFIG_BKL. 2010-10-21 15:44:13 +02:00
Kconfig.binfmt
libfs.c fs: add sync_inode_metadata 2010-10-25 21:18:19 -04:00
locks.c fs/locks.c: prepare for BKL removal 2010-10-05 11:02:04 +02:00
Makefile
mbcache.c mbcache: Limit the maximum number of cache entries 2010-08-18 06:24:41 -04:00
mpage.c
namei.c fs: move permission check back into __lookup_hash 2010-10-25 21:18:19 -04:00
namespace.c BKL: Remove BKL from do_new_mount() 2010-10-04 21:10:43 +02:00
nfsctl.c
no-block.c llseek: automatically add .llseek fop 2010-10-15 15:53:27 +02:00
open.c fs: cleanup files_lock locking 2010-08-18 08:35:47 -04:00
pipe.c pipe: fix failure to return error code on ->confirm() 2010-10-21 14:56:33 +02:00
pnode.c fs: brlock vfsmount_lock 2010-08-18 08:35:48 -04:00
pnode.h
posix_acl.c
read_write.c vfs: make no_llseek the default 2010-10-15 15:53:46 +02:00
read_write.h
readdir.c vfs: fix warning: 'dirent' is used uninitialized in this function 2010-08-09 20:45:05 -07:00
select.c
seq_file.c fs/seq_file.c: Remove unnecessary casts of private_data 2010-09-23 13:28:23 +02:00
signalfd.c llseek: automatically add .llseek fop 2010-10-15 15:53:27 +02:00
splice.c splice: fix misuse of SPLICE_F_NONBLOCK 2010-08-07 18:52:56 +02:00
stack.c
stat.c Mark arguments to certain syscalls as being const 2010-08-13 16:53:13 -07:00
statfs.c add f_flags to struct statfs(64) 2010-08-09 16:48:44 -04:00
super.c fs: scale files_lock 2010-08-18 08:35:48 -04:00
sync.c get rid of file_fsync() 2010-08-09 16:47:43 -04:00
timerfd.c llseek: automatically add .llseek fop 2010-10-15 15:53:27 +02:00
utimes.c Mark arguments to certain syscalls as being const 2010-08-13 16:53:13 -07:00
xattr_acl.c
xattr.c