kernel_optimize_test/kernel/locking
Byungchul Park d141babe42 locking/lockdep: Add a boot parameter allowing unwind in cross-release and disable it by default
Johan Hovold reported a heavy performance regression caused by lockdep
cross-release:

 > Boot time (from "Linux version" to login prompt) had in fact doubled
 > since 4.13 where it took 17 seconds (with my current config) compared to
 > the 35 seconds I now see with 4.14-rc4.
 >
 > I quick bisect pointed to lockdep and specifically the following commit:
 >
 >	28a903f63e ("locking/lockdep: Handle non(or multi)-acquisition
 >	               of a crosslock")
 >
 > which I've verified is the commit which doubled the boot time (compared
 > to 28a903f63ec0^) (added by lockdep crossrelease series [1]).

Currently cross-release performs unwind on every acquisition, but that
is very expensive.

This patch makes unwind optional and disables it by default and only
records acquire_ip.

Full stack traces are sometimes required for full analysis, in which
case a boot paramter, crossrelease_fullstack, can be specified.

On my qemu Ubuntu machine (x86_64, 4 cores, 512M), the regression was
fixed. We measure boot times with 'perf stat --null --repeat 10 $QEMU',
where $QEMU launches a kernel with init=/bin/true:

1. No lockdep enabled:

 Performance counter stats for 'qemu_booting_time.sh bzImage' (10 runs):

       2.756558155 seconds time elapsed                    ( +-  0.09% )

2. Lockdep enabled:

 Performance counter stats for 'qemu_booting_time.sh bzImage' (10 runs):

       2.968710420 seconds time elapsed                    ( +-  0.12% )

3. Lockdep enabled + cross-release enabled:

 Performance counter stats for 'qemu_booting_time.sh bzImage' (10 runs):

       3.153839636 seconds time elapsed                    ( +-  0.31% )

4. Lockdep enabled + cross-release enabled + this patch applied:

 Performance counter stats for 'qemu_booting_time.sh bzImage' (10 runs):

       2.963669551 seconds time elapsed                    ( +-  0.11% )

I.e. lockdep cross-release performance is now indistinguishable from
vanilla lockdep.

Bisected-by: Johan Hovold <johan@kernel.org>
Analyzed-by: Thomas Gleixner <tglx@linutronix.de>
Suggested-by: Thomas Gleixner <tglx@linutronix.de>
Reported-by: Johan Hovold <johan@kernel.org>
Signed-off-by: Byungchul Park <byungchul.park@lge.com>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: amir73il@gmail.com
Cc: axboe@kernel.dk
Cc: darrick.wong@oracle.com
Cc: david@fromorbit.com
Cc: hch@infradead.org
Cc: idryomov@gmail.com
Cc: johannes.berg@intel.com
Cc: kernel-team@lge.com
Cc: linux-block@vger.kernel.org
Cc: linux-fsdevel@vger.kernel.org
Cc: linux-mm@kvack.org
Cc: linux-xfs@vger.kernel.org
Cc: oleg@redhat.com
Cc: tj@kernel.org
Link: http://lkml.kernel.org/r/1508921765-15396-5-git-send-email-byungchul.park@lge.com
Signed-off-by: Ingo Molnar <mingo@kernel.org>
2017-10-25 12:19:01 +02:00
..
lockdep_internals.h locking/lockdep: Avoid creating redundant links 2017-08-10 12:29:04 +02:00
lockdep_proc.c locking/lockdep: Avoid creating redundant links 2017-08-10 12:29:04 +02:00
lockdep_states.h locking/lockdep: Rework FS_RECLAIM annotation 2017-08-10 12:29:03 +02:00
lockdep.c locking/lockdep: Add a boot parameter allowing unwind in cross-release and disable it by default 2017-10-25 12:19:01 +02:00
locktorture.c sched/headers: Prepare for the removal of <linux/rtmutex.h> from <linux/sched.h> 2017-03-02 08:42:32 +01:00
Makefile locking/ww_mutex: Begin kselftests for ww_mutex 2017-01-14 11:37:14 +01:00
mcs_spinlock.h locking/core: Remove cpu_relax_lowlatency() users 2016-11-16 10:15:10 +01:00
mutex-debug.c locking/mutex: Rework mutex::owner 2016-10-25 11:31:50 +02:00
mutex-debug.h locking/mutex: Fix lockdep_assert_held() fail 2017-01-30 11:42:59 +01:00
mutex.c mutex, futex: adjust kernel-doc markups to generate ReST 2017-05-16 08:43:25 -03:00
mutex.h locking/mutex: Fix lockdep_assert_held() fail 2017-01-30 11:42:59 +01:00
osq_lock.c locking/osq_lock: Fix osq_lock queue corruption 2017-08-10 12:28:54 +02:00
percpu-rwsem.c locking/percpu-rwsem: Replace waitqueue with rcuwait 2017-01-14 11:14:35 +01:00
qrwlock.c locking/qrwlock: Prevent slowpath writers getting held up by fastpath 2017-10-25 10:57:25 +02:00
qspinlock_paravirt.h locking/pvqspinlock: Relax cmpxchg's to improve performance on some architectures 2017-08-29 15:14:38 +02:00
qspinlock_stat.h sched/headers: Prepare for new header dependencies before moving code to <linux/sched/clock.h> 2017-03-02 08:42:27 +01:00
qspinlock.c locking: Remove spin_unlock_wait() generic definitions 2017-08-17 08:08:58 -07:00
rtmutex_common.h locking/rtmutex: replace top-waiter and pi_waiters leftmost caching 2017-09-08 18:26:49 -07:00
rtmutex-debug.c locking/rtmutex: replace top-waiter and pi_waiters leftmost caching 2017-09-08 18:26:49 -07:00
rtmutex-debug.h rt_mutex: Add lockdep annotations 2017-06-08 10:35:49 +02:00
rtmutex.c locking/rtmutex: replace top-waiter and pi_waiters leftmost caching 2017-09-08 18:26:49 -07:00
rtmutex.h rt_mutex: Add lockdep annotations 2017-06-08 10:35:49 +02:00
rwsem-spinlock.c locking/rwsem-spinlock: Add killable versions of __down_read() 2017-08-10 12:28:55 +02:00
rwsem-xadd.c locking/rwsem-xadd: Fix missed wakeup due to reordering of load 2017-09-29 10:10:20 +02:00
rwsem.c locking/rwsem: Add down_read_killable() 2017-10-10 11:50:16 +02:00
rwsem.h locking/rwsem: Protect all writes to owner by WRITE_ONCE() 2016-06-08 15:16:59 +02:00
semaphore.c sched/headers: Prepare for new header dependencies before moving code to <linux/sched/debug.h> 2017-03-02 08:42:34 +01:00
spinlock_debug.c locking/spinlock/debug: Remove spinlock lockup detection code 2017-02-10 09:09:49 +01:00
spinlock.c locking/core: Remove {read,spin,write}_can_lock() 2017-10-10 11:50:18 +02:00
test-ww_mutex.c mm: treewide: remove GFP_TEMPORARY allocation flag 2017-09-13 18:53:16 -07:00