kernel_optimize_test/fs
Nathan Zimmer 4a8c7bb59a mm/mempolicy.c: convert the shared_policy lock to a rwlock
When running the SPECint_rate gcc on some very large boxes it was
noticed that the system was spending lots of time in
mpol_shared_policy_lookup().  The gamess benchmark can also show it and
is what I mostly used to chase down the issue since the setup for that I
found to be easier.

To be clear the binaries were on tmpfs because of disk I/O requirements.
We then used text replication to avoid icache misses and having all the
copies banging on the memory where the instruction code resides.  This
results in us hitting a bottleneck in mpol_shared_policy_lookup() since
lookup is serialised by the shared_policy lock.

I have only reproduced this on very large (3k+ cores) boxes.  The
problem starts showing up at just a few hundred ranks getting worse
until it threatens to livelock once it gets large enough.  For example
on the gamess benchmark at 128 ranks this area consumes only ~1% of
time, at 512 ranks it consumes nearly 13%, and at 2k ranks it is over
90%.

To alleviate the contention in this area I converted the spinlock to an
rwlock.  This allows a large number of lookups to happen simultaneously.
The results were quite good reducing this consumtion at max ranks to
around 2%.

[akpm@linux-foundation.org: tidy up code comments]
Signed-off-by: Nathan Zimmer <nzimmer@sgi.com>
Acked-by: David Rientjes <rientjes@google.com>
Acked-by: Vlastimil Babka <vbabka@suse.cz>
Cc: Nadia Yvette Chambers <nyc@holomorphy.com>
Cc: Naoya Horiguchi <n-horiguchi@ah.jp.nec.com>
Cc: Mel Gorman <mgorman@suse.de>
Cc: "Aneesh Kumar K.V" <aneesh.kumar@linux.vnet.ibm.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2016-01-14 16:00:49 -08:00
..
9p kmemcg: account certain kmem allocations to memcg 2016-01-14 16:00:49 -08:00
adfs kmemcg: account certain kmem allocations to memcg 2016-01-14 16:00:49 -08:00
affs kmemcg: account certain kmem allocations to memcg 2016-01-14 16:00:49 -08:00
afs kmemcg: account certain kmem allocations to memcg 2016-01-14 16:00:49 -08:00
autofs4 switch ->get_link() to delayed_call, kill ->put_link() 2015-12-30 13:01:03 -05:00
befs kmemcg: account certain kmem allocations to memcg 2016-01-14 16:00:49 -08:00
bfs kmemcg: account certain kmem allocations to memcg 2016-01-14 16:00:49 -08:00
btrfs kmemcg: account certain kmem allocations to memcg 2016-01-14 16:00:49 -08:00
cachefiles convert a bunch of open-coded instances of memdup_user_nul() 2016-01-04 10:26:58 -05:00
ceph kmemcg: account certain kmem allocations to memcg 2016-01-14 16:00:49 -08:00
cifs kmemcg: account certain kmem allocations to memcg 2016-01-14 16:00:49 -08:00
coda kmemcg: account certain kmem allocations to memcg 2016-01-14 16:00:49 -08:00
configfs Configfs changes for the 4.5 merge window: 2016-01-12 18:15:34 -08:00
cramfs don't put symlink bodies in pagecache into highmem 2015-12-08 22:41:36 -05:00
debugfs
devpts
dlm convert a bunch of open-coded instances of memdup_user_nul() 2016-01-04 10:26:58 -05:00
ecryptfs kmemcg: account certain kmem allocations to memcg 2016-01-14 16:00:49 -08:00
efivarfs
efs kmemcg: account certain kmem allocations to memcg 2016-01-14 16:00:49 -08:00
exofs kmemcg: account certain kmem allocations to memcg 2016-01-14 16:00:49 -08:00
exportfs
ext2 kmemcg: account certain kmem allocations to memcg 2016-01-14 16:00:49 -08:00
ext4 kmemcg: account certain kmem allocations to memcg 2016-01-14 16:00:49 -08:00
f2fs kmemcg: account certain kmem allocations to memcg 2016-01-14 16:00:49 -08:00
fat kmemcg: account certain kmem allocations to memcg 2016-01-14 16:00:49 -08:00
freevxfs don't put symlink bodies in pagecache into highmem 2015-12-08 22:41:36 -05:00
fscache
fuse kmemcg: account certain kmem allocations to memcg 2016-01-14 16:00:49 -08:00
gfs2 kmemcg: account certain kmem allocations to memcg 2016-01-14 16:00:49 -08:00
hfs kmemcg: account certain kmem allocations to memcg 2016-01-14 16:00:49 -08:00
hfsplus kmemcg: account certain kmem allocations to memcg 2016-01-14 16:00:49 -08:00
hostfs kmemcg: account certain kmem allocations to memcg 2016-01-14 16:00:49 -08:00
hpfs kmemcg: account certain kmem allocations to memcg 2016-01-14 16:00:49 -08:00
hugetlbfs mm/mempolicy.c: convert the shared_policy lock to a rwlock 2016-01-14 16:00:49 -08:00
isofs kmemcg: account certain kmem allocations to memcg 2016-01-14 16:00:49 -08:00
jbd2 fs: use block_device name vsprintf helper 2016-01-06 13:03:18 -05:00
jffs2 kmemcg: account certain kmem allocations to memcg 2016-01-14 16:00:49 -08:00
jfs kmemcg: account certain kmem allocations to memcg 2016-01-14 16:00:49 -08:00
kernfs Revert "kernfs: do not account ino_ida allocations to memcg" 2016-01-14 16:00:49 -08:00
lockd
logfs kmemcg: account certain kmem allocations to memcg 2016-01-14 16:00:49 -08:00
minix kmemcg: account certain kmem allocations to memcg 2016-01-14 16:00:49 -08:00
ncpfs kmemcg: account certain kmem allocations to memcg 2016-01-14 16:00:49 -08:00
nfs kmemcg: account certain kmem allocations to memcg 2016-01-14 16:00:49 -08:00
nfs_common
nfsd Merge branch 'work.misc' of git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs 2016-01-12 17:11:47 -08:00
nilfs2 kmemcg: account certain kmem allocations to memcg 2016-01-14 16:00:49 -08:00
nls
notify fsnotify: destroy marks with call_srcu instead of dedicated thread 2016-01-14 16:00:49 -08:00
ntfs kmemcg: account certain kmem allocations to memcg 2016-01-14 16:00:49 -08:00
ocfs2 kmemcg: account certain kmem allocations to memcg 2016-01-14 16:00:49 -08:00
omfs
openpromfs kmemcg: account certain kmem allocations to memcg 2016-01-14 16:00:49 -08:00
overlayfs switch ->get_link() to delayed_call, kill ->put_link() 2015-12-30 13:01:03 -05:00
proc kmemcg: account certain kmem allocations to memcg 2016-01-14 16:00:49 -08:00
pstore
qnx4 kmemcg: account certain kmem allocations to memcg 2016-01-14 16:00:49 -08:00
qnx6 kmemcg: account certain kmem allocations to memcg 2016-01-14 16:00:49 -08:00
quota
ramfs don't put symlink bodies in pagecache into highmem 2015-12-08 22:41:36 -05:00
reiserfs kmemcg: account certain kmem allocations to memcg 2016-01-14 16:00:49 -08:00
romfs kmemcg: account certain kmem allocations to memcg 2016-01-14 16:00:49 -08:00
squashfs kmemcg: account certain kmem allocations to memcg 2016-01-14 16:00:49 -08:00
sysfs
sysv kmemcg: account certain kmem allocations to memcg 2016-01-14 16:00:49 -08:00
tracefs
ubifs kmemcg: account certain kmem allocations to memcg 2016-01-14 16:00:49 -08:00
udf kmemcg: account certain kmem allocations to memcg 2016-01-14 16:00:49 -08:00
ufs kmemcg: account certain kmem allocations to memcg 2016-01-14 16:00:49 -08:00
xfs kmemcg: account certain kmem allocations to memcg 2016-01-14 16:00:49 -08:00
aio.c
anon_inodes.c
attr.c
bad_inode.c fs/bad_inode.c: is_bad_inode can be boolean 2015-12-06 21:17:14 -05:00
binfmt_aout.c
binfmt_elf_fdpic.c
binfmt_elf.c
binfmt_em86.c
binfmt_flat.c
binfmt_misc.c
binfmt_script.c
block_dev.c kmemcg: account certain kmem allocations to memcg 2016-01-14 16:00:49 -08:00
buffer.c fs: use block_device name vsprintf helper 2016-01-06 13:03:18 -05:00
char_dev.c
compat_binfmt_elf.c
compat_ioctl.c Merge branch 'work.misc' of git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs 2016-01-12 17:11:47 -08:00
compat.c saner calling conventions for copy_mount_options() 2016-01-04 10:28:32 -05:00
coredump.c coredump: Use 64bit time for unix time of coredump 2015-12-06 21:17:17 -05:00
dax.c
dcache.c kmemcg: account certain kmem allocations to memcg 2016-01-14 16:00:49 -08:00
dcookies.c
direct-io.c fix the regression from "direct-io: Fix negative return from dio read beyond eof" 2015-12-08 15:02:42 -05:00
drop_caches.c
eventfd.c
eventpoll.c
exec.c don't carry MAY_OPEN in op->acc_mode 2016-01-04 10:28:40 -05:00
fcntl.c fcntl: allow to set O_DIRECT flag on pipe 2016-01-09 02:55:37 -05:00
fhandle.c
file_table.c
file.c kmemcg: account certain kmem allocations to memcg 2016-01-14 16:00:49 -08:00
filesystems.c
fs_pin.c
fs_struct.c
fs-writeback.c
inode.c kmemcg: account certain kmem allocations to memcg 2016-01-14 16:00:49 -08:00
internal.h Merge branch 'for-linus' into work.misc 2016-01-08 21:20:11 -05:00
ioctl.c Merge branch 'work.copy_file_range' of git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs 2016-01-12 16:30:34 -08:00
Kconfig File locking related changes for v4.5 (pile #1) 2016-01-12 15:46:17 -08:00
Kconfig.binfmt
libfs.c switch ->get_link() to delayed_call, kill ->put_link() 2015-12-30 13:01:03 -05:00
locks.c Merge branch 'work.copy_file_range' of git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs 2016-01-12 16:30:34 -08:00
Makefile
mbcache.c
mount.h
mpage.c
namei.c Merge branch 'work.misc' of git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs 2016-01-12 17:11:47 -08:00
namespace.c Merge branch 'work.misc' of git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs 2016-01-12 17:11:47 -08:00
no-block.c
nsfs.c
open.c don't carry MAY_OPEN in op->acc_mode 2016-01-04 10:28:40 -05:00
pipe.c
pnode.c
pnode.h
posix_acl.c xattr handlers: Simplify list operation 2015-12-13 19:46:12 -05:00
proc_namespace.c vfs: show_vfsstat: remove redundant initialization and check of error code 2015-12-06 21:17:16 -05:00
read_write.c Merge branch 'work.misc' of git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs 2016-01-12 17:11:47 -08:00
readdir.c
select.c poll: plug an unused argument to do_poll 2016-01-06 08:26:52 -05:00
seq_file.c
signalfd.c
splice.c fs: __generic_file_splice_read retry lookup on AOP_TRUNCATED_PAGE 2016-01-09 02:55:35 -05:00
stack.c
stat.c
statfs.c
super.c fs: use block_device name vsprintf helper 2016-01-06 13:03:18 -05:00
sync.c
timerfd.c
userfaultfd.c
utimes.c
xattr.c Merge branch 'work.misc' of git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs 2016-01-12 17:11:47 -08:00