kernel_optimize_test/kernel
David Hildenbrand ec62d04e3f kernel/resource: make release_mem_region_adjustable() never fail
Patch series "selective merging of system ram resources", v4.

Some add_memory*() users add memory in small, contiguous memory blocks.
Examples include virtio-mem, hyper-v balloon, and the XEN balloon.

This can quickly result in a lot of memory resources, whereby the actual
resource boundaries are not of interest (e.g., it might be relevant for
DIMMs, exposed via /proc/iomem to user space).  We really want to merge
added resources in this scenario where possible.

Resources are effectively stored in a list-based tree.  Having a lot of
resources not only wastes memory, it also makes traversing that tree more
expensive, and makes /proc/iomem explode in size (e.g., requiring
kexec-tools to manually merge resources when creating a kdump header.  The
current kexec-tools resource count limit does not allow for more than
~100GB of memory with a memory block size of 128MB on x86-64).

Let's allow to selectively merge system ram resources by specifying a new
flag for add_memory*().  Patch #5 contains a /proc/iomem example.  Only
tested with virtio-mem.

This patch (of 8):

Let's make sure splitting a resource on memory hotunplug will never fail.
This will become more relevant once we merge selected System RAM resources
- then, we'll trigger that case more often on memory hotunplug.

In general, this function is already unlikely to fail.  When we remove
memory, we free up quite a lot of metadata (memmap, page tables, memory
block device, etc.).  The only reason it could really fail would be when
injecting allocation errors.

All other error cases inside release_mem_region_adjustable() seem to be
sanity checks if the function would be abused in different context - let's
add WARN_ON_ONCE() in these cases so we can catch them.

[natechancellor@gmail.com: fix use of ternary condition in release_mem_region_adjustable]
  Link: https://lkml.kernel.org/r/20200922060748.2452056-1-natechancellor@gmail.com
  Link: https://github.com/ClangBuiltLinux/linux/issues/1159

Signed-off-by: David Hildenbrand <david@redhat.com>
Signed-off-by: Nathan Chancellor <natechancellor@gmail.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Cc: Michal Hocko <mhocko@suse.com>
Cc: Dan Williams <dan.j.williams@intel.com>
Cc: Jason Gunthorpe <jgg@ziepe.ca>
Cc: Kees Cook <keescook@chromium.org>
Cc: Ard Biesheuvel <ardb@kernel.org>
Cc: Pankaj Gupta <pankaj.gupta.linux@gmail.com>
Cc: Baoquan He <bhe@redhat.com>
Cc: Wei Yang <richardw.yang@linux.intel.com>
Cc: Anton Blanchard <anton@ozlabs.org>
Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org>
Cc: Boris Ostrovsky <boris.ostrovsky@oracle.com>
Cc: Christian Borntraeger <borntraeger@de.ibm.com>
Cc: Dave Jiang <dave.jiang@intel.com>
Cc: Eric Biederman <ebiederm@xmission.com>
Cc: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Cc: Haiyang Zhang <haiyangz@microsoft.com>
Cc: Heiko Carstens <hca@linux.ibm.com>
Cc: Jason Wang <jasowang@redhat.com>
Cc: Juergen Gross <jgross@suse.com>
Cc: Julien Grall <julien@xen.org>
Cc: "K. Y. Srinivasan" <kys@microsoft.com>
Cc: Len Brown <lenb@kernel.org>
Cc: Leonardo Bras <leobras.c@gmail.com>
Cc: Libor Pechacek <lpechacek@suse.cz>
Cc: Michael Ellerman <mpe@ellerman.id.au>
Cc: "Michael S. Tsirkin" <mst@redhat.com>
Cc: Nathan Lynch <nathanl@linux.ibm.com>
Cc: "Oliver O'Halloran" <oohall@gmail.com>
Cc: Paul Mackerras <paulus@samba.org>
Cc: Pingfan Liu <kernelfans@gmail.com>
Cc: "Rafael J. Wysocki" <rjw@rjwysocki.net>
Cc: Roger Pau Monn <roger.pau@citrix.com>
Cc: Stefano Stabellini <sstabellini@kernel.org>
Cc: Stephen Hemminger <sthemmin@microsoft.com>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: Vasily Gorbik <gor@linux.ibm.com>
Cc: Vishal Verma <vishal.l.verma@intel.com>
Cc: Wei Liu <wei.liu@kernel.org>
Link: https://lkml.kernel.org/r/20200911103459.10306-2-david@redhat.com
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2020-10-16 11:11:17 -07:00
..
bpf objtool changes for v5.10: 2020-10-14 10:13:37 -07:00
cgroup cgroup: Zero sized write should be no-op 2020-09-30 13:52:06 -04:00
configs compiler: remove CONFIG_OPTIMIZE_INLINING entirely 2020-04-07 10:43:42 -07:00
debug treewide: Use fallthrough pseudo-keyword 2020-08-23 17:36:59 -05:00
dma dma-mapping updates for 5.10 2020-10-15 14:43:29 -07:00
entry * Misc minor cleanups. 2020-10-12 10:51:02 -07:00
events These are the performance events changes for v5.10: 2020-10-12 14:14:35 -07:00
gcov gcov: add support for GCC 10.1 2020-09-11 09:33:54 -07:00
irq Surgery of the MSI interrupt handling to prepare the support of upcoming 2020-10-12 11:40:41 -07:00
kcsan kcsan: Use tracing-safe version of prandom 2020-08-30 21:50:13 -07:00
livepatch livepatch: Make klp_apply_object_relocs static 2020-05-11 00:31:38 +02:00
locking Merge branch 'locking/urgent' into locking/core, to pick up fixes 2020-10-09 08:55:17 +02:00
power Power management updates for 5.10-rc1 2020-10-14 10:45:41 -07:00
printk Merge branch 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/jikos/trivial 2020-10-15 15:11:56 -07:00
rcu A small set of updates for debug objects: 2020-10-12 11:21:24 -07:00
sched Power management updates for 5.10-rc1 2020-10-14 10:45:41 -07:00
time These are the locking updates for v5.10: 2020-10-12 13:06:20 -07:00
trace Char/Misc driver patches for 5.10-rc1 2020-10-15 10:01:51 -07:00
.gitignore
acct.c mmap locking API: convert mmap_sem comments 2020-06-09 09:39:14 -07:00
async.c treewide: Remove uninitialized_var() usage 2020-07-16 12:35:15 -07:00
audit_fsnotify.c fsnotify: create method handle_inode_event() in fsnotify_operations 2020-07-27 23:25:50 +02:00
audit_tree.c \n 2020-08-06 19:29:51 -07:00
audit_watch.c fsnotify: create method handle_inode_event() in fsnotify_operations 2020-07-27 23:25:50 +02:00
audit.c audit: Remove redundant null check 2020-08-26 09:10:39 -04:00
audit.h audit: change unnecessary globals into statics 2020-08-17 20:26:58 -04:00
auditfilter.c treewide: Use fallthrough pseudo-keyword 2020-08-23 17:36:59 -05:00
auditsc.c audit/stable-5.9 PR 20200803 2020-08-04 14:20:26 -07:00
backtracetest.c treewide: Replace DECLARE_TASKLET() with DECLARE_TASKLET_OLD() 2020-07-30 11:15:58 -07:00
bounds.c
capability.c treewide: Use fallthrough pseudo-keyword 2020-08-23 17:36:59 -05:00
compat.c treewide: Use fallthrough pseudo-keyword 2020-08-23 17:36:59 -05:00
configs.c
context_tracking.c context_tracking: Ensure that the critical path cannot be instrumented 2020-06-11 15:14:36 +02:00
cpu_pm.c notifier: Fix broken error handling pattern 2020-09-01 09:58:03 +02:00
cpu.c The changes in this cycle are: 2020-06-03 13:06:42 -07:00
crash_core.c kdump: append kernel build-id string to VMCOREINFO 2020-08-12 10:58:01 -07:00
crash_dump.c crash_dump: Remove no longer used saved_max_pfn 2020-04-15 11:21:54 +02:00
cred.c exec: Teach prepare_exec_creds how exec treats uids & gids 2020-05-20 14:44:21 -05:00
delayacct.c
dma.c
elfcore.c
exec_domain.c
exit.c exit: support non-blocking pidfds 2020-09-04 12:31:30 +02:00
extable.c kernel/extable.c: use address-of operator on section symbols 2020-04-07 10:43:42 -07:00
fail_function.c
fork.c kernel-clone-v5.9 2020-10-14 14:32:52 -07:00
freezer.c
futex.c futex: Convert to use the preferred 'fallthrough' macro 2020-08-13 21:02:12 +02:00
gen_kheaders.sh kbuild: add variables for compression tools 2020-06-06 23:42:01 +09:00
groups.c mm: remove the pgprot argument to __vmalloc 2020-06-02 10:59:11 -07:00
hung_task.c kernel/hung_task.c: introduce sysctl to print all traces when a hung task is detected 2020-06-08 11:05:56 -07:00
iomem.c
irq_work.c irq_work, smp: Allow irq_work on call_single_queue 2020-05-28 10:54:15 +02:00
jump_label.c jump_label,module: Fix module lifetime for __jump_label_mod_text_reserved() 2020-09-01 09:58:04 +02:00
kallsyms.c treewide: Use fallthrough pseudo-keyword 2020-08-23 17:36:59 -05:00
kcmp.c
Kconfig.freezer
Kconfig.hz
Kconfig.locks
Kconfig.preempt
kcov.c kcov: make some symbols static 2020-08-12 10:58:02 -07:00
kexec_core.c objtool: Rename frame.h -> objtool.h 2020-09-10 10:43:13 -05:00
kexec_elf.c
kexec_file.c fs/kernel_file_read: Add "offset" arg for partial reads 2020-10-05 13:37:04 +02:00
kexec_internal.h
kexec.c LSM: Introduce kernel_post_load_data() hook 2020-10-05 13:37:03 +02:00
kheaders.c
kmod.c kmod: remove redundant "be an" in the comment 2020-08-12 10:58:01 -07:00
kprobes.c This tree prepares to unify the kretprobe trampoline handler and make 2020-10-12 14:21:15 -07:00
ksysfs.c
kthread.c uaccess: add force_uaccess_{begin,end} helpers 2020-08-12 10:57:59 -07:00
latencytop.c sysctl: pass kernel pointers to ->proc_handler 2020-04-27 02:07:40 -04:00
Makefile static_call: Add inline static call infrastructure 2020-09-01 09:58:04 +02:00
module_signature.c
module_signing.c
module-internal.h
module.c Char/Misc driver patches for 5.10-rc1 2020-10-15 10:01:51 -07:00
notifier.c notifier: Fix broken error handling pattern 2020-09-01 09:58:03 +02:00
nsproxy.c nsproxy: support CLONE_NEWTIME with setns() 2020-07-08 11:14:22 +02:00
padata.c padata: fix possible padata_works_lock deadlock 2020-09-04 17:51:55 +10:00
panic.c panic: make print_oops_end_marker() static 2020-08-12 10:58:02 -07:00
params.c moduleparams: Add hexint type parameter 2020-07-28 13:44:53 +02:00
pid_namespace.c pid_namespace: use checkpoint_restore_ns_capable() for ns_last_pid 2020-07-19 20:14:42 +02:00
pid.c pidfd: support PIDFD_NONBLOCK in pidfd_open() 2020-09-04 12:34:50 +02:00
profile.c
ptrace.c
range.c
reboot.c arch: remove unicore32 port 2020-07-01 12:09:13 +03:00
regset.c regset: kill ->get() 2020-07-27 14:31:12 -04:00
relay.c kernel/relay.c: fix memleak on destroy relay channel 2020-08-21 09:52:53 -07:00
resource.c kernel/resource: make release_mem_region_adjustable() never fail 2020-10-16 11:11:17 -07:00
rseq.c
scs.c mm: memcontrol: account kernel stack per node 2020-08-07 11:33:25 -07:00
seccomp.c seccomp: Make duplicate listener detection non-racy 2020-10-08 13:17:47 -07:00
signal.c treewide: Use fallthrough pseudo-keyword 2020-08-23 17:36:59 -05:00
smp.c smp: Fix a potential usage of stale nr_cpus 2020-07-22 10:22:04 +02:00
smpboot.c
smpboot.h
softirq.c softirq: Add debug check to __raise_softirq_irqoff() 2020-09-16 15:18:56 +02:00
stackleak.c stackleak: let stack_erasing_sysctl take a kernel pointer buffer 2020-09-19 13:13:39 -07:00
stacktrace.c stacktrace: Remove reliable argument from arch_stack_walk() callback 2020-09-18 14:24:16 +01:00
static_call.c static_call: Fix return type of static_call_init 2020-10-02 21:18:25 +02:00
stop_machine.c
sys_ni.c quota: simplify the quotactl compat handling 2020-09-17 13:00:46 -04:00
sys.c treewide: Use fallthrough pseudo-keyword 2020-08-23 17:36:59 -05:00
sysctl-test.c
sysctl.c mm: allow a controlled amount of unfairness in the page lock 2020-09-17 10:26:41 -07:00
task_work.c task_work: only grab task signal lock when needed 2020-08-13 09:01:38 -06:00
taskstats.c
test_kprobes.c
torture.c torture: Dump ftrace at shutdown only if requested 2020-06-29 12:01:45 -07:00
tracepoint.c tracepoint: Fix out of sync data passing by static caller 2020-10-02 21:18:25 +02:00
tsacct.c
ucount.c ucount: Make sure ucounts in /proc/sys/user don't regress again 2020-04-07 21:51:27 +02:00
uid16.c
uid16.h
umh.c usermodehelper: reset umask to default before executing user process 2020-10-06 10:31:52 -07:00
up.c
user_namespace.c nsproxy: add struct nsset 2020-05-09 13:57:12 +02:00
user-return-notifier.c
user.c user.c: make uidhash_table static 2020-06-04 19:06:24 -07:00
usermode_driver.c umd: Stop using split_argv 2020-07-07 11:58:59 -05:00
utsname_sysctl.c sysctl: pass kernel pointers to ->proc_handler 2020-04-27 02:07:40 -04:00
utsname.c nsproxy: add struct nsset 2020-05-09 13:57:12 +02:00
watch_queue.c watch_queue: Limit the number of watches a user can hold 2020-08-17 09:39:18 -07:00
watchdog_hld.c
watchdog.c kernel/watchdog.c: convert {soft/hard}lockup boot parameters to sysctl aliases 2020-06-08 11:05:56 -07:00
workqueue_internal.h
workqueue.c treewide: Make all debug_obj_descriptors const 2020-09-24 21:56:25 +02:00