kernel_optimize_test/fs
Shakeel Butt d46eb14b73 fs: fsnotify: account fsnotify metadata to kmemcg
Patch series "Directed kmem charging", v8.

The Linux kernel's memory cgroup allows limiting the memory usage of the
jobs running on the system to provide isolation between the jobs.  All
the kernel memory allocated in the context of the job and marked with
__GFP_ACCOUNT will also be included in the memory usage and be limited
by the job's limit.

The kernel memory can only be charged to the memcg of the process in
whose context kernel memory was allocated.  However there are cases
where the allocated kernel memory should be charged to the memcg
different from the current processes's memcg.  This patch series
contains two such concrete use-cases i.e.  fsnotify and buffer_head.

The fsnotify event objects can consume a lot of system memory for large
or unlimited queues if there is either no or slow listener.  The events
are allocated in the context of the event producer.  However they should
be charged to the event consumer.  Similarly the buffer_head objects can
be allocated in a memcg different from the memcg of the page for which
buffer_head objects are being allocated.

To solve this issue, this patch series introduces mechanism to charge
kernel memory to a given memcg.  In case of fsnotify events, the memcg
of the consumer can be used for charging and for buffer_head, the memcg
of the page can be charged.  For directed charging, the caller can use
the scope API memalloc_[un]use_memcg() to specify the memcg to charge
for all the __GFP_ACCOUNT allocations within the scope.

This patch (of 2):

A lot of memory can be consumed by the events generated for the huge or
unlimited queues if there is either no or slow listener.  This can cause
system level memory pressure or OOMs.  So, it's better to account the
fsnotify kmem caches to the memcg of the listener.

However the listener can be in a different memcg than the memcg of the
producer and these allocations happen in the context of the event
producer.  This patch introduces remote memcg charging API which the
producer can use to charge the allocations to the memcg of the listener.

There are seven fsnotify kmem caches and among them allocations from
dnotify_struct_cache, dnotify_mark_cache, fanotify_mark_cache and
inotify_inode_mark_cachep happens in the context of syscall from the
listener.  So, SLAB_ACCOUNT is enough for these caches.

The objects from fsnotify_mark_connector_cachep are not accounted as
they are small compared to the notification mark or events and it is
unclear whom to account connector to since it is shared by all events
attached to the inode.

The allocations from the event caches happen in the context of the event
producer.  For such caches we will need to remote charge the allocations
to the listener's memcg.  Thus we save the memcg reference in the
fsnotify_group structure of the listener.

This patch has also moved the members of fsnotify_group to keep the size
same, at least for 64 bit build, even with additional member by filling
the holes.

[shakeelb@google.com: use GFP_KERNEL_ACCOUNT rather than open-coding it]
  Link: http://lkml.kernel.org/r/20180702215439.211597-1-shakeelb@google.com
Link: http://lkml.kernel.org/r/20180627191250.209150-2-shakeelb@google.com
Signed-off-by: Shakeel Butt <shakeelb@google.com>
Acked-by: Johannes Weiner <hannes@cmpxchg.org>
Cc: Michal Hocko <mhocko@kernel.org>
Cc: Jan Kara <jack@suse.cz>
Cc: Amir Goldstein <amir73il@gmail.com>
Cc: Greg Thelen <gthelen@google.com>
Cc: Vladimir Davydov <vdavydov.dev@gmail.com>
Cc: Roman Gushchin <guro@fb.com>
Cc: Alexander Viro <viro@zeniv.linux.org.uk>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2018-08-17 16:20:30 -07:00
..
9p get rid of 'opened' argument of ->atomic_open() - part 3 2018-07-12 10:04:20 -04:00
adfs adfs: don't put inodes into icache 2018-08-03 16:03:33 -04:00
affs
afs Merge git://git.kernel.org/pub/scm/linux/kernel/git/davem/net-next 2018-08-15 15:04:25 -07:00
autofs autofs: fix slab out of bounds read in getname_kernel() 2018-07-14 11:11:09 -07:00
befs
bfs
btrfs btrfs: readpages() should submit IO as read-ahead 2018-08-17 16:20:29 -07:00
cachefiles cachefiles: Wait rather than BUG'ing on "Unexpected object collision" 2018-07-25 14:49:00 +01:00
ceph Merge branch 'work.open3' of git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs 2018-08-13 19:58:36 -07:00
cifs smb3/cifs fixes (including 8 for stable). Improved tracing, stats, snapshots (previous version mounts work now), performance (compounding enabled for statfs) 2018-08-13 22:32:11 -07:00
coda
configfs configfs: fix registered group removal 2018-07-17 06:14:07 -07:00
cramfs
crypto
debugfs
devpts
dlm
ecryptfs
efivarfs efivars: Call guid_parse() against guid_t type of variable 2018-07-22 14:13:44 +02:00
efs
exofs exofs: use bio_clone_fast in _write_mirror 2018-07-24 14:43:20 -06:00
exportfs
ext2 dax: remove VM_MIXEDMAP for fsdax and device dax 2018-08-17 16:20:27 -07:00
ext4 ext4: readpages() should submit IO as read-ahead 2018-08-17 16:20:29 -07:00
f2fs mpage: mpage_readpages() should submit IO as read-ahead 2018-08-17 16:20:29 -07:00
fat fat: fix memory allocation failure handling of match_strdup() 2018-07-21 12:50:46 -07:00
freevxfs
fscache fscache: Fix reference overput in fscache_attach_object() error handling 2018-07-25 14:49:00 +01:00
fuse Merge branch 'work.mkdir' of git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs 2018-08-13 20:25:58 -07:00
gfs2 gfs2 4.19 merge 2018-08-15 22:40:03 -07:00
hfs new helper: inode_fake_hash() 2018-08-03 16:03:32 -04:00
hfsplus
hostfs vfs: discard ATTR_ATTR_FLAG 2018-08-17 16:20:28 -07:00
hpfs fs/hpfs: extend gmt_to_local() conversion to 64-bit times 2018-08-17 16:20:27 -07:00
hugetlbfs Merge branch 'work.open3' of git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs 2018-08-13 19:58:36 -07:00
isofs
jbd2 jbd2: replace current_kernel_time64 with ktime equivalent 2018-07-29 15:51:47 -04:00
jffs2 jffs2: use unsigned 32-bit timstamps consistently 2018-07-18 16:44:01 +02:00
jfs Just one jfs patch for 4.19 2018-08-15 22:47:23 -07:00
kernfs kernfs: allow creating kernfs objects with arbitrary uid/gid 2018-07-20 23:44:35 -07:00
lockd
minix
nfs Merge branch 'work.mkdir' of git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs 2018-08-13 20:25:58 -07:00
nfs_common
nfsd IMA: don't propagate opened through the entire thing 2018-07-12 10:04:19 -04:00
nilfs2
nls
notify fs: fsnotify: account fsnotify metadata to kmemcg 2018-08-17 16:20:30 -07:00
ntfs ntfs: mft: remove VLA usage 2018-08-17 16:20:27 -07:00
ocfs2 ocfs2: make several functions and variables static (and some const) 2018-08-17 16:20:28 -07:00
omfs
openpromfs
orangefs orangefs: remove redundant pointer orangefs_inode 2018-08-14 12:07:14 -04:00
overlayfs
proc fs/proc/task_mmu.c: fix Locked field in /proc/pid/smaps* 2018-07-14 11:11:09 -07:00
pstore pstore: add zstd compression support 2018-08-03 18:12:18 -07:00
qnx4
qnx6
quota
ramfs
reiserfs reiserfs: fix buffer overflow with long warning messages 2018-07-14 11:11:10 -07:00
romfs
squashfs Squashfs: Compute expected length from inode size rather than block length 2018-08-02 09:34:02 -07:00
sysfs SCSI misc on 20180815 2018-08-15 22:06:26 -07:00
sysv
tracefs
ubifs
udf Merge branch 'work.mkdir' of git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs 2018-08-13 20:25:58 -07:00
ufs fs/ufs: use ktime_get_real_seconds for sb and cg timestamps 2018-08-17 16:20:27 -07:00
xfs dax: remove VM_MIXEDMAP for fsdax and device dax 2018-08-17 16:20:27 -07:00
aio.c Merge branch 'work.aio' of git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs 2018-08-13 20:56:23 -07:00
anon_inodes.c anon_inode_getfile(): switch to alloc_file_pseudo() 2018-07-12 10:04:27 -04:00
attr.c fs: Fix attr.c kernel-doc 2018-07-03 16:44:45 -04:00
bad_inode.c get rid of 'opened' argument of ->atomic_open() - part 3 2018-07-12 10:04:20 -04:00
binfmt_aout.c
binfmt_elf_fdpic.c
binfmt_elf.c Here are the main MIPS changes for 4.19. 2018-08-13 19:24:32 -07:00
binfmt_em86.c
binfmt_flat.c
binfmt_misc.c turn filp_clone_open() into inline wrapper for dentry_open() 2018-07-10 23:29:03 -04:00
binfmt_script.c
block_dev.c for-4.19/block-20180812 2018-08-14 10:23:25 -07:00
buffer.c
char_dev.c
compat_binfmt_elf.c
compat_ioctl.c media: dvb/audio.h: get rid of unused APIs 2018-07-30 16:21:49 -04:00
compat.c
coredump.c
d_path.c
dax.c dax: dax_layout_busy_page() warn on !exceptional 2018-07-29 16:59:16 -04:00
dcache.c fs/dcache.c: fix kmemcheck splat at take_dentry_name_snapshot() 2018-08-17 16:20:28 -07:00
dcookies.c
direct-io.c
drop_caches.c
eventfd.c Revert changes to convert to ->poll_mask() and aio IOCB_CMD_POLL 2018-06-28 10:40:47 -07:00
eventpoll.c Revert changes to convert to ->poll_mask() and aio IOCB_CMD_POLL 2018-06-28 10:40:47 -07:00
exec.c mm: fix vma_is_anonymous() false-positives 2018-07-26 19:38:03 -07:00
fcntl.c
fhandle.c
file_table.c make alloc_file() static 2018-07-12 10:04:29 -04:00
file.c
filesystems.c
fs_pin.c
fs_struct.c
fs-writeback.c
inode.c Merge branch 'work.mkdir' of git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs 2018-08-13 20:25:58 -07:00
internal.h Merge branch 'iomap-4.19-merge' of git://git.kernel.org/pub/scm/fs/xfs/xfs-linux 2018-08-13 22:29:03 -07:00
ioctl.c
iomap.c Changes for 4.19: 2018-08-14 08:56:02 -07:00
Kconfig
Kconfig.binfmt kconfig: move the "Executable file formats" menu to fs/Kconfig.binfmt 2018-08-02 08:06:55 +09:00
libfs.c
locks.c File locking fixes and enhancements for v4.19 2018-08-13 21:56:50 -07:00
Makefile
mbcache.c
mount.h
mpage.c mpage: mpage_readpages() should submit IO as read-ahead 2018-08-17 16:20:29 -07:00
namei.c Merge branches 'work.misc' and 'work.dcache' of git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs 2018-08-13 21:28:25 -07:00
namespace.c fix __legitimize_mnt()/mntput() race 2018-08-09 17:51:32 -04:00
no-block.c
nsfs.c
open.c ->atomic_open(): return 0 in all success cases 2018-07-12 10:04:21 -04:00
pipe.c Merge branch 'work.open3' of git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs 2018-08-13 19:58:36 -07:00
pnode.c
pnode.h
posix_acl.c
proc_namespace.c
read_write.c
readdir.c
select.c Revert changes to convert to ->poll_mask() and aio IOCB_CMD_POLL 2018-06-28 10:40:47 -07:00
seq_file.c fs/seq_file.c: simplify seq_file iteration code and interface 2018-08-17 16:20:28 -07:00
signalfd.c
splice.c
stack.c
stat.c
statfs.c kernel: add kcompat_sys_{f,}statfs64() 2018-07-12 14:49:48 +01:00
super.c
sync.c
timerfd.c Merge branch 'work.aio' of git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs 2018-08-13 20:56:23 -07:00
userfaultfd.c userfaultfd: remove uffd flags from vma->vm_flags if UFFD_EVENT_FORK fails 2018-08-02 16:03:40 -07:00
utimes.c
xattr.c