kernel_optimize_test

Waiman Long fd99aeb978 clocksource: Avoid accidental unstable marking of clocksources

[ Upstream commit c86ff8c55b8ae68837b2fa59dc0c203907e9a15f ]

Since commit db3a34e17433 ("clocksource: Retry clock read if long delays detected") and commit 2e27e793e280 ("clocksource: Reduce clocksource-skew threshold"), it is found that tsc clocksource fallback to hpet can sometimes happen on both Intel and AMD systems, especially when they are running stressful benchmarking workloads. Of the 23 systems tested with a v5.14 kernel, 10 of them have switched to hpet clock source during the test run.

The result of falling back to hpet is a drastic reduction of performance when running benchmarks. For example, the fio performance tests can drop up to 70% whereas the iperf3 performance can drop up to 80%.

4 hpet fallbacks happened during bootup. They were:

  [ 8.749399] clocksource: timekeeping watchdog on CPU13: hpet read-back delay of 263750ns, attempt 4, marking unstable
  [ 12.044610] clocksource: timekeeping watchdog on CPU19: hpet read-back delay of 186166ns, attempt 4, marking unstable
  [ 17.336941] clocksource: timekeeping watchdog on CPU28: hpet read-back delay of 182291ns, attempt 4, marking unstable
  [ 17.518565] clocksource: timekeeping watchdog on CPU34: hpet read-back delay of 252196ns, attempt 4, marking unstable

Other fallbacks happened when the systems were running stressful benchmarks. For example:

  [ 2685.867873] clocksource: timekeeping watchdog on CPU117: hpet read-back delay of 57269ns, attempt 4, marking unstable
  [46215.471228] clocksource: timekeeping watchdog on CPU8: hpet read-back delay of 61460ns, attempt 4, marking unstable

Commit 2e27e793e280 ("clocksource: Reduce clocksource-skew threshold") changed the skew margin from 100us to 50us. I think this is too small and can easily be exceeded when running some stressful workloads on a thermally stressed system. So it is switched back to 100us.

Even a maximum skew margin of 100us may be too small for some systems when booting up, especially if those systems are under thermal stress. To eliminate the case where the large skew is due to the system being too busy, slowing down the reading of both the watchdog and the clocksource, an extra consecutive read of the watchdog clock is done to check this. The consecutive watchdog read delay is compared against WATCHDOG_MAX_SKEW/2. If the delay exceeds the limit, we assume that the system is just too busy. A warning will be printed to the console and the clock skew check is skipped for this round.

Fixes: db3a34e17433 ("clocksource: Retry clock read if long delays detected")
Fixes: 2e27e793e280 ("clocksource: Reduce clocksource-skew threshold")
Signed-off-by: Waiman Long <longman@redhat.com>
Signed-off-by: Paul E. McKenney <paulmck@kernel.org>
Signed-off-by: Sasha Levin <sashal@kernel.org>

2022-01-27 10:54:06 +01:00
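The decision described in the commit message can be sketched as a small user-space C function. This is an illustration only, not the code in clocksource.c: the enum, the function name, and the nanosecond-suffixed constant are hypothetical, and only the 100us margin and the WATCHDOG_MAX_SKEW/2 comparison are taken from the message. The sketch assumes that a read-back delay within the margin lets the skew check proceed, and that a long read-back delay without a slow consecutive watchdog read leads to a retry.

```c
#include <stdint.h>

/* From the commit message: the skew margin is restored to 100us. */
#define WATCHDOG_MAX_SKEW_NS 100000LL

/* Hypothetical outcome names used only in this sketch. */
enum wd_check {
    WD_CHECK_OK,    /* read-back delay within margin: do the skew check */
    WD_CHECK_RETRY, /* read-back too slow, system not obviously busy: retry */
    WD_CHECK_SKIP   /* consecutive watchdog reads too slow: skip this round */
};

/*
 * wd_delay_ns:     time between the two watchdog reads that bracket the
 *                  clocksource read (the "read-back delay" in the logs).
 * wd_seq_delay_ns: time between two back-to-back watchdog reads.
 */
static enum wd_check classify_watchdog_read(int64_t wd_delay_ns,
                                            int64_t wd_seq_delay_ns)
{
    if (wd_delay_ns <= WATCHDOG_MAX_SKEW_NS)
        return WD_CHECK_OK;

    /* A slow consecutive read suggests the system is simply too busy. */
    if (wd_seq_delay_ns > WATCHDOG_MAX_SKEW_NS / 2)
        return WD_CHECK_SKIP;

    return WD_CHECK_RETRY;
}
```

Under these assumptions, the 263750ns read-back delay from the boot log would be classified as a skip if the consecutive watchdog read took more than 50000ns, and as a retry otherwise.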
..
alarmtimer.c	kernel, fs: Introduce and use set_restart_fn() and arch_set_restart_data()	2021-03-25 09:04:16 +01:00
clockevents.c
clocksource.c	clocksource: Avoid accidental unstable marking of clocksources	2022-01-27 10:54:06 +01:00
hrtimer.c	hrtimer: Ensure timerfd notification for HIGHRES=n	2021-09-15 09:50:25 +02:00
itimer.c	time: Prevent undefined behaviour in timespec64_to_ns()	2020-10-26 11:48:11 +01:00
jiffies.c	clocksource: Reduce clocksource-skew threshold	2022-01-27 10:54:05 +01:00
Kconfig	posix-cpu-timers: Provide mechanisms to defer timer handling to task_work	2020-08-06 16:50:59 +02:00
Makefile	ns: Introduce Time Namespace	2020-01-14 12:20:48 +01:00
namespace.c	nsproxy: support CLONE_NEWTIME with setns()	2020-07-08 11:14:22 +02:00
ntp_internal.h
ntp.c	ntp/y2038: Remove incorrect time_t truncation	2019-11-12 08:13:44 +01:00
posix-clock.c	posix-clocks: Rename the clock_get() callback to clock_get_timespec()	2020-01-14 12:20:49 +01:00
posix-cpu-timers.c	posix-cpu-timers: Clear task::posix_cputimers_work in copy_process()	2021-11-18 14:04:29 +01:00
posix-stubs.c	posix-timers: Make clock_nanosleep() time namespace aware	2020-01-14 12:20:55 +01:00
posix-timers.c	posix-timers: Preserve return value in clock_adjtime32()	2021-05-11 14:47:16 +02:00
posix-timers.h	posix-clocks: Introduce clock_get_ktime() callback	2020-01-14 12:20:51 +01:00
sched_clock.c	time/sched_clock: Mark sched_clock_read_begin/retry() as notrace	2020-10-26 11:34:31 +01:00
test_udelay.c
tick-broadcast-hrtimer.c	tick: broadcast-hrtimer: Fix a race in bc_set_next	2019-09-27 14:45:55 +02:00
tick-broadcast.c	treewide: Use fallthrough pseudo-keyword	2020-08-23 17:36:59 -05:00
tick-common.c	timekeeping: Split jiffies seqlock	2020-03-21 16:00:23 +01:00
tick-internal.h	hrtimer: Ensure timerfd notification for HIGHRES=n	2021-09-15 09:50:25 +02:00
tick-oneshot.c
tick-sched.c	tick/sched: Remove bogus boot "safety" check	2021-01-06 14:56:55 +01:00
tick-sched.h
time.c	y2038: remove unused time32 interfaces	2020-02-21 11:22:15 -08:00
timeconst.bc
timeconv.c
timecounter.c
timekeeping_debug.c
timekeeping_internal.h	timekeeping/vsyscall: Provide vdso_update_begin/end()	2020-08-06 10:57:30 +02:00
timekeeping.c	timekeeping: Really make sure wall_to_monotonic isn't positive	2021-12-22 09:30:58 +01:00
timekeeping.h	timekeeping: Split jiffies seqlock	2020-03-21 16:00:23 +01:00
timer_list.c	timer_list: Guard procfs specific code	2019-06-23 00:08:52 +02:00
timer.c	timers: Move clearing of base::timer_running under base:: Lock	2021-08-12 13:22:15 +02:00
vsyscall.c	timekeeping/vsyscall: Provide vdso_update_begin/end()	2020-08-06 10:57:30 +02:00