kernel_optimize_test/kernel/sched
Zhang Qiao c85c6fadbe kernel/sched: Fix sched_fork() access an invalid sched_task_group
[ Upstream commit 4ef0c5c6b5ba1f38f0ea1cedad0cad722f00c14a ]

There is a small race between copy_process() and sched_fork()
where child->sched_task_group point to an already freed pointer.

	parent doing fork()      | someone moving the parent
				 | to another cgroup
  -------------------------------+-------------------------------
  copy_process()
      + dup_task_struct()<1>
				  parent move to another cgroup,
				  and free the old cgroup. <2>
      + sched_fork()
	+ __set_task_cpu()<3>
	+ task_fork_fair()
	  + sched_slice()<4>

In the worst case, this bug can lead to "use-after-free" and
cause panic as shown above:

  (1) parent copy its sched_task_group to child at <1>;

  (2) someone move the parent to another cgroup and free the old
      cgroup at <2>;

  (3) the sched_task_group and cfs_rq that belong to the old cgroup
      will be accessed at <3> and <4>, which cause a panic:

  [] BUG: unable to handle kernel NULL pointer dereference at 0000000000000000
  [] PGD 8000001fa0a86067 P4D 8000001fa0a86067 PUD 2029955067 PMD 0
  [] Oops: 0000 [#1] SMP PTI
  [] CPU: 7 PID: 648398 Comm: ebizzy Kdump: loaded Tainted: G           OE    --------- -  - 4.18.0.x86_64+ #1
  [] RIP: 0010:sched_slice+0x84/0xc0

  [] Call Trace:
  []  task_fork_fair+0x81/0x120
  []  sched_fork+0x132/0x240
  []  copy_process.part.5+0x675/0x20e0
  []  ? __handle_mm_fault+0x63f/0x690
  []  _do_fork+0xcd/0x3b0
  []  do_syscall_64+0x5d/0x1d0
  []  entry_SYSCALL_64_after_hwframe+0x65/0xca
  [] RIP: 0033:0x7f04418cd7e1

Between cgroup_can_fork() and cgroup_post_fork(), the cgroup
membership and thus sched_task_group can't change. So update child's
sched_task_group at sched_post_fork() and move task_fork() and
__set_task_cpu() (where accees the sched_task_group) from sched_fork()
to sched_post_fork().

Fixes: 8323f26ce3 ("sched: Fix race in task_group")
Signed-off-by: Zhang Qiao <zhangqiao22@huawei.com>
Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Acked-by: Tejun Heo <tj@kernel.org>
Link: https://lkml.kernel.org/r/20210915064030.2231-1-zhangqiao22@huawei.com
Signed-off-by: Sasha Levin <sashal@kernel.org>
2021-11-18 14:04:07 +01:00
..
autogroup.c
autogroup.h
clock.c
completion.c completion: Use lockdep_assert_RT_in_threaded_ctx() in complete_all() 2020-03-23 18:40:25 +01:00
core.c kernel/sched: Fix sched_fork() access an invalid sched_task_group 2021-11-18 14:04:07 +01:00
cpuacct.c sched/cpuacct: Fix charge cpuacct.usage_sys 2020-05-19 20:34:14 +02:00
cpudeadline.c sched/deadline: Implement fallback mechanism for !fit case 2020-06-15 14:10:05 +02:00
cpudeadline.h
cpufreq_schedutil.c cpufreq: schedutil: Use kobject release() method to free sugov_tunables 2021-10-06 15:55:45 +02:00
cpufreq.c
cpupri.c sched/rt: cpupri_find: Trigger a full search as fallback 2020-03-20 13:06:20 +01:00
cpupri.h sched/rt: Optimize cpupri_find() on non-heterogenous systems 2020-03-06 12:57:27 +01:00
cputime.c sched/cputime: Improve cputime_adjust() 2020-06-15 14:10:00 +02:00
deadline.c sched/deadline: Fix missing clock update in migrate_task_rq_dl() 2021-09-15 09:50:24 +02:00
debug.c sched/fair: Fix util_est UTIL_AVG_UNCHANGED handling 2021-06-16 12:01:46 +02:00
fair.c sched/numa: Fix is_core_idle() 2021-09-15 09:50:28 +02:00
features.h sched,fair: Alternative sched_slice() 2021-05-11 14:47:31 +02:00
idle.c sched/idle: Make the idle timer expire in hard interrupt context 2021-09-26 14:09:02 +02:00
isolation.c isolcpus: Affine unbound kernel threads to housekeeping cpus 2020-06-15 14:10:03 +02:00
loadavg.c sched: nohz: stop passing around unused "ticks" parameter. 2020-07-22 10:22:04 +02:00
Makefile
membarrier.c sched/membarrier: fix missing local execution of ipi_sync_rq_state() 2021-03-17 17:06:35 +01:00
pelt.c sched: Add a tracepoint to track rq->nr_running 2020-07-08 11:39:02 +02:00
pelt.h sched/fair: Fix util_est UTIL_AVG_UNCHANGED handling 2021-06-16 12:01:46 +02:00
psi.c psi: Fix race between psi_trigger_create/destroy 2021-07-14 16:56:10 +02:00
rt.c sched/rt: Fix RT utilization tracking during policy change 2021-07-14 16:56:09 +02:00
sched-pelt.h
sched.h sched/deadline: Fix reset_on_fork reporting of DL tasks 2021-09-15 09:50:24 +02:00
smp.h sched/headers: Split out open-coded prototypes into kernel/sched/smp.h 2020-05-28 11:03:20 +02:00
stats.c
stats.h psi: Move PF_MEMSTALL out of task->flags 2020-03-20 13:06:19 +01:00
stop_task.c treewide: Convert macro and uses of __section(foo) to __section("foo") 2020-10-25 14:51:49 -07:00
swait.c sched/swait: Prepare usage in completions 2020-03-21 16:00:23 +01:00
topology.c Scheduler changes for v5.10: 2020-10-12 12:56:01 -07:00
wait_bit.c
wait.c rq-qos: fix missed wake-ups in rq_qos_throttle try two 2021-07-19 09:45:00 +02:00