kernel_optimize_test/arch/x86
Tony Luck 619d747c18 x86/mce: Avoid infinite loop for copy from user recovery
commit 81065b35e2486c024c7aa86caed452e1f01a59d4 upstream.

There are two cases for machine check recovery:

1) The machine check was triggered by ring3 (application) code.
   This is the simpler case. The machine check handler simply queues
   work to be executed on return to user. That code unmaps the page
   from all users and arranges to send a SIGBUS to the task that
   triggered the poison.

2) The machine check was triggered in kernel code that is covered by
   an exception table entry. In this case the machine check handler
   still queues a work entry to unmap the page, etc., but this will
   not be called right away because the #MC handler returns to the
   fix up code address in the exception table entry.

Problems occur if the kernel triggers another machine check before the
return to user processes the first queued work item.

Specifically, the work is queued using the ->mce_kill_me callback
structure in the task struct for the current thread. Attempting to queue
a second work item using this same callback results in a loop in the
linked list of work functions to call. So when the kernel does return to
user, it enters an infinite loop processing the same entry forever.
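
The failure mode can be illustrated outside the kernel. The following is a
minimal user-space sketch, not the kernel's task_work implementation: the
names work_item, push() and walk_len() are hypothetical, and the list is
assumed to be a simple singly linked list where new work is pushed at the
head. Pushing the same node twice makes it point at itself, so a walker
never reaches the end of the list.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-in for a queued work callback structure. */
struct work_item {
    struct work_item *next;
};

/* Push a work item at the head of the list (assumed queueing scheme). */
static void push(struct work_item **head, struct work_item *w)
{
    assert(w != NULL);
    w->next = *head;   /* if w is already the head, this makes w->next == w */
    *head = w;
}

/*
 * Walk the list, visiting at most 'limit' nodes.
 * Returns the number of nodes, or -1 if the end was not reached
 * within 'limit' steps (which is what a cycle looks like).
 */
static int walk_len(const struct work_item *head, int limit)
{
    int n = 0;

    while (head != NULL) {
        if (n == limit)
            return -1;
        n++;
        head = head->next;
    }
    return n;
}
```

With a single push, walk_len() reports one entry. After a second push of the
same item, the item's next pointer refers to itself and walk_len() only
terminates because of its step limit; a walker without such a limit would
spin on the same entry.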

There are some legitimate scenarios where the kernel may take a second
machine check before returning to the user.

1) Some code (e.g. futex) first tries a get_user() with page faults
   disabled. If this fails, the code retries with page faults enabled
   expecting that this will resolve the page fault.

2) Copy from user code retries a copy in byte-at-a-time mode to check
   whether any additional bytes can be copied.

On the other side of the fence are some bad drivers that do not check
the return value from individual get_user() calls and may access
multiple user addresses without noticing that some/all calls have
failed.

Fix by adding a counter (current->mce_count) to keep track of repeated
machine checks before task_work() is called. The first machine check saves
the address information and calls task_work_add(). Subsequent machine
checks that arrive before that task_work callback is executed check that
the address is in the same page as the first machine check (since the
callback will offline exactly one page).

Expected worst case is four machine checks before moving on (e.g. one
user access with page faults disabled, then a repeat to the same address
with page faults enabled ... repeat in copy tail bytes). Just in case
there is some code that loops forever, enforce a limit of 10.

 [ bp: Massage commit message, drop noinstr, fix typo, extend panic
   messages. ]

Fixes: 5567d11c21 ("x86/mce: Send #MC singal from task work")
Signed-off-by: Tony Luck <tony.luck@intel.com>
Signed-off-by: Borislav Petkov <bp@suse.de>
Cc: <stable@vger.kernel.org>
Link: https://lkml.kernel.org/r/YT/IJ9ziLqmtqEPu@agluck-desk2.amr.corp.intel.com
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2021-09-22 12:28:07 +02:00
..
boot x86/boot/compressed/64: Check SEV encryption in the 32-bit boot-path 2021-05-26 12:06:57 +02:00
configs
crypto crypto: x86/curve25519 - fix cpu feature checking logic in mod_exit 2021-07-14 16:56:06 +02:00
entry x86/sev: Split up runtime #VC handler for correct state tracking 2021-07-14 16:56:09 +02:00
events perf/x86/amd/ibs: Extend PERF_PMU_CAP_NO_EXCLUDE to IBS Op 2021-09-15 09:50:47 +02:00
hyperv
ia32
include x86/uaccess: Fix 32-bit __get_user_asm_u64() when CC_HAS_ASM_GOTO_OUTPUT=y 2021-09-22 12:27:58 +02:00
kernel x86/mce: Avoid infinite loop for copy from user recovery 2021-09-22 12:28:07 +02:00
kvm KVM: nVMX: Unconditionally clear nested.pi_pending on nested VM-Enter 2021-09-15 09:50:48 +02:00
lib x86/msr: Fix wr/rdmsr_safe_regs_on_cpu() prototypes 2021-05-22 11:40:51 +02:00
math-emu
mm x86/mm: Fix kern_addr_valid() to cope with existing but not present entries 2021-09-22 12:27:56 +02:00
net bpf: Introduce BPF nospec instruction for mitigating Spectre v4 2021-08-04 12:46:44 +02:00
oprofile
pci PCI: Add AMD RS690 quirk to enable 64-bit DMA 2021-06-30 08:47:23 -04:00
platform
power PM: hibernate: x86: Use crc32 instead of md5 for hibernation e820 integrity check 2021-05-14 09:50:21 +02:00
purgatory
ras
realmode
tools x86/tools: Fix objdump version check again 2021-08-18 08:59:15 +02:00
um
video
xen xen: reset legacy rtc flag for PV domU 2021-09-22 12:27:54 +02:00
.gitignore
Kbuild
Kconfig x86/platform/uv: Fix !KEXEC build failure 2021-05-14 09:50:20 +02:00
Kconfig.assembler
Kconfig.cpu
Kconfig.debug
Makefile x86/build: Propagate $(CLANG_FLAGS) to $(REALMODE_FLAGS) 2021-05-11 14:47:18 +02:00
Makefile_32.cpu
Makefile.um