kernel_optimize_test/kernel
Linus Torvalds 4f30a60aa7 close-range-v5.9
-----BEGIN PGP SIGNATURE-----
 
 iHUEABYKAB0WIQRAhzRXHqcMeLMyaSiRxhvAZXjcogUCXygcpgAKCRCRxhvAZXjc
 ogPeAQDv1ncqtNroFAC4pJ4tQhH7JSjW0OltiMk/AocY/J2SdQD9GJ15luYJ0/om
 697q/Z68sndRynhdoZlMuf3oYuBlHQw=
 =3ZhE
 -----END PGP SIGNATURE-----

Merge tag 'close-range-v5.9' of git://git.kernel.org/pub/scm/linux/kernel/git/brauner/linux

Pull close_range() implementation from Christian Brauner:
 "This adds the close_range() syscall. It allows to efficiently close a
  range of file descriptors up to all file descriptors of a calling
  task.

  This is coordinated with the FreeBSD folks which have copied our
  version of this syscall and in the meantime have already merged it in
  April 2019:

    https://reviews.freebsd.org/D21627
    https://svnweb.freebsd.org/base?view=revision&revision=359836

  The syscall originally came up in a discussion around the new mount
  API and making new file descriptor types cloexec by default. During
  this discussion, Al suggested the close_range() syscall.

  First, it helps to close all file descriptors of an exec()ing task.
  This can be done safely via (quoting Al's example from [1] verbatim):

        /* that exec is sensitive */
        unshare(CLONE_FILES);
        /* we don't want anything past stderr here */
        close_range(3, ~0U);
        execve(....);

  The code snippet above is one way of working around the problem that
  file descriptors are not cloexec by default. This is aggravated by the
  fact that we can't just switch them over without massively regressing
  userspace. For a whole class of programs having an in-kernel method of
  closing all file descriptors is very helpful (e.g. demons, service
  managers, programming language standard libraries, container managers
  etc.).

  Second, it allows userspace to avoid implementing closing all file
  descriptors by parsing through /proc/<pid>/fd/* and calling close() on
  each file descriptor and other hacks. From looking at various
  large(ish) userspace code bases this or similar patterns are very
  common in service managers, container runtimes, and programming
  language runtimes/standard libraries such as Python or Rust.

  In addition, the syscall will also work for tasks that do not have
  procfs mounted and on kernels that do not have procfs support compiled
  in. In such situations the only way to make sure that all file
  descriptors are closed is to call close() on each file descriptor up
  to UINT_MAX or RLIMIT_NOFILE, OPEN_MAX trickery.

  Based on Linus' suggestion close_range() also comes with a new flag
  CLOSE_RANGE_UNSHARE to more elegantly handle file descriptor dropping
  right before exec. This would usually be expressed in the sequence:

        unshare(CLONE_FILES);
        close_range(3, ~0U);

  as pointed out by Linus it might be desirable to have this be a part
  of close_range() itself under a new flag CLOSE_RANGE_UNSHARE which
  gets especially handy when we're closing all file descriptors above a
  certain threshold.

  Test-suite as always included"

* tag 'close-range-v5.9' of git://git.kernel.org/pub/scm/linux/kernel/git/brauner/linux:
  tests: add CLOSE_RANGE_UNSHARE tests
  close_range: add CLOSE_RANGE_UNSHARE
  tests: add close_range() tests
  arch: wire-up close_range()
  open: add close_range()
2020-08-04 15:12:02 -07:00
..
bpf Merge git://git.kernel.org/pub/scm/linux/kernel/git/bpf/bpf 2020-07-31 17:19:47 -07:00
cgroup for-5.9/block-20200802 2020-08-03 11:57:03 -07:00
configs
debug Remove uninitialized_var() macro for v5.9-rc1 2020-08-04 13:49:43 -07:00
dma Remove uninitialized_var() macro for v5.9-rc1 2020-08-04 13:49:43 -07:00
events Remove uninitialized_var() macro for v5.9-rc1 2020-08-04 13:49:43 -07:00
gcov treewide: replace '---help---' in Kconfig files with 'help' 2020-06-14 01:57:21 +09:00
irq tasklets API update for v5.9-rc1 2020-08-04 13:40:35 -07:00
kcsan Merge branch 'kcsan' of git://git.kernel.org/pub/scm/linux/kernel/git/paulmck/linux-rcu into locking/core 2020-08-01 09:26:27 +02:00
livepatch
locking Remove uninitialized_var() macro for v5.9-rc1 2020-08-04 13:49:43 -07:00
power Merge branches 'pm-sleep', 'pm-domains', 'powercap' and 'pm-tools' 2020-08-03 13:12:44 +02:00
printk Revert "kernel/printk: add kmsg SEEK_CUR handling" 2020-06-21 20:47:20 -07:00
rcu Merge branch 'for-mingo' of git://git.kernel.org/pub/scm/linux/kernel/git/paulmck/linux-rcu into core/rcu 2020-07-31 00:15:53 +02:00
sched Power management updates for 5.9-rc1 2020-08-03 20:28:08 -07:00
time threads-v5.9 2020-08-04 14:40:07 -07:00
trace Remove uninitialized_var() macro for v5.9-rc1 2020-08-04 13:49:43 -07:00
.gitignore
acct.c mmap locking API: convert mmap_sem comments 2020-06-09 09:39:14 -07:00
async.c treewide: Remove uninitialized_var() usage 2020-07-16 12:35:15 -07:00
audit_fsnotify.c
audit_tree.c audit: Use struct_size() helper in alloc_chunk 2020-06-17 16:43:11 -04:00
audit_watch.c
audit.c audit/stable-5.9 PR 20200803 2020-08-04 14:20:26 -07:00
audit.h revert: 1320a4052e ("audit: trigger accompanying records when no rules present") 2020-07-29 10:00:36 -04:00
auditfilter.c
auditsc.c audit/stable-5.9 PR 20200803 2020-08-04 14:20:26 -07:00
backtracetest.c treewide: Replace DECLARE_TASKLET() with DECLARE_TASKLET_OLD() 2020-07-30 11:15:58 -07:00
bounds.c
capability.c
compat.c
configs.c
context_tracking.c context_tracking: Ensure that the critical path cannot be instrumented 2020-06-11 15:14:36 +02:00
cpu_pm.c
cpu.c The changes in this cycle are: 2020-06-03 13:06:42 -07:00
crash_core.c crash_core, vmcoreinfo: Append 'MAX_PHYSMEM_BITS' to vmcoreinfo 2020-07-02 17:56:11 +01:00
crash_dump.c
cred.c
delayacct.c
dma.c
elfcore.c
exec_domain.c
exit.c Merge branch 'exec-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/ebiederm/user-namespace 2020-08-04 14:27:25 -07:00
extable.c
fail_function.c
fork.c close-range-v5.9 2020-08-04 15:12:02 -07:00
freezer.c
futex.c Remove uninitialized_var() macro for v5.9-rc1 2020-08-04 13:49:43 -07:00
gen_kheaders.sh kbuild: add variables for compression tools 2020-06-06 23:42:01 +09:00
groups.c mm: remove the pgprot argument to __vmalloc 2020-06-02 10:59:11 -07:00
hung_task.c kernel/hung_task.c: introduce sysctl to print all traces when a hung task is detected 2020-06-08 11:05:56 -07:00
iomem.c
irq_work.c
jump_label.c
kallsyms.c Linux 5.8-rc7 2020-07-28 13:18:01 +02:00
kcmp.c
Kconfig.freezer
Kconfig.hz
Kconfig.locks
Kconfig.preempt
kcov.c kcov: check kcov_softirq in kcov_remote_stop() 2020-06-10 19:14:17 -07:00
kexec_core.c
kexec_elf.c
kexec_file.c kexec: do not verify the signature without the lockdown or mandatory signature 2020-06-26 00:27:36 -07:00
kexec_internal.h
kexec.c
kheaders.c
kmod.c
kprobes.c kprobes: Remove unnecessary module_mutex locking from kprobe_optimizer() 2020-07-28 13:19:05 +02:00
ksysfs.c
kthread.c Merge branch 'sched/urgent' 2020-07-08 11:38:59 +02:00
latencytop.c
Makefile Merge branch 'exec-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/ebiederm/user-namespace 2020-08-04 14:27:25 -07:00
module_signature.c
module_signing.c
module-internal.h
module.c Refactor kallsyms_show_value() users for correct cred 2020-07-09 13:09:30 -07:00
notifier.c mm: remove vmalloc_sync_(un)mappings() 2020-06-02 10:59:12 -07:00
nsproxy.c nsproxy: support CLONE_NEWTIME with setns() 2020-07-08 11:14:22 +02:00
padata.c padata: remove padata_parallel_queue 2020-07-23 17:34:18 +10:00
panic.c bug: Annotate WARN/BUG/stackfail as noinstr safe 2020-06-11 15:14:36 +02:00
params.c
pid_namespace.c pid_namespace: use checkpoint_restore_ns_capable() for ns_last_pid 2020-07-19 20:14:42 +02:00
pid.c cap-checkpoint-restore-v5.9 2020-08-04 15:02:07 -07:00
profile.c
ptrace.c
range.c
reboot.c arch: remove unicore32 port 2020-07-01 12:09:13 +03:00
relay.c mmap locking API: convert mmap_sem comments 2020-06-09 09:39:14 -07:00
resource.c
rseq.c
scs.c scs: Report SCS usage in bytes rather than number of entries 2020-06-04 16:14:56 +01:00
seccomp.c seccomp: Introduce addfd ioctl to seccomp user notifier 2020-07-14 16:29:42 -07:00
signal.c signal: fix typo in dequeue_synchronous_signal() 2020-07-26 23:57:52 +02:00
smp.c smp: Fix a potential usage of stale nr_cpus 2020-07-22 10:22:04 +02:00
smpboot.c
smpboot.h
softirq.c tasklets API update for v5.9-rc1 2020-08-04 13:40:35 -07:00
stackleak.c gcc-plugins/stackleak: Use asm instrumentation to avoid useless register saving 2020-06-24 07:48:28 -07:00
stacktrace.c
stop_machine.c
sys_ni.c
sys.c prctl: exe link permission error changed from -EINVAL to -EPERM 2020-07-19 20:14:42 +02:00
sysctl_binary.c
sysctl-test.c
sysctl.c sched/uclamp: Add a new sysctl to control RT default boost value 2020-07-29 13:51:47 +02:00
task_work.c task_work: teach task_work_add() to do signal_wake_up() 2020-06-30 12:18:08 -06:00
taskstats.c
test_kprobes.c
torture.c torture: Dump ftrace at shutdown only if requested 2020-06-29 12:01:45 -07:00
tracepoint.c
tsacct.c
ucount.c
uid16.c
uid16.h
umh.c exec: Implement kernel_execve 2020-07-21 08:24:52 -05:00
up.c
user_namespace.c
user-return-notifier.c
user.c user.c: make uidhash_table static 2020-06-04 19:06:24 -07:00
usermode_driver.c umd: Stop using split_argv 2020-07-07 11:58:59 -05:00
utsname_sysctl.c
utsname.c
watch_queue.c Notifications over pipes + Keyring notifications 2020-06-13 09:56:21 -07:00
watchdog_hld.c
watchdog.c kernel/watchdog.c: convert {soft/hard}lockup boot parameters to sysctl aliases 2020-06-08 11:05:56 -07:00
workqueue_internal.h
workqueue.c maccess: rename probe_kernel_{read,write} to copy_{from,to}_kernel_nofault 2020-06-17 10:57:41 -07:00