kernel_optimize_test/kernel/locking
Waiman Long 34d54f3d69 locking/pvqspinlock: Relax cmpxchg's to improve performance on some architectures
All the locking related cmpxchg's in the following functions are
replaced with the _acquire variants:

 - pv_queued_spin_steal_lock()
 - trylock_clear_pending()

This change should help performance on architectures that use LL/SC.

The cmpxchg in pv_kick_node() is replaced with a relaxed version
with explicit memory barrier to make sure that it is fully ordered
in the writing of next->lock and the reading of pn->state whether
the cmpxchg is a success or failure without affecting performance in
non-LL/SC architectures.

On a 2-socket 12-core 96-thread Power8 system with pvqspinlock
explicitly enabled, the performance of a locking microbenchmark
with and without this patch on a 4.13-rc4 kernel with Xinhui's PPC
qspinlock patch were as follows:

  # of thread     w/o patch    with patch      % Change
  -----------     ---------    ----------      --------
       8         5054.8 Mop/s  5209.4 Mop/s     +3.1%
      16         3985.0 Mop/s  4015.0 Mop/s     +0.8%
      32         2378.2 Mop/s  2396.0 Mop/s     +0.7%

Suggested-by: Peter Zijlstra <peterz@infradead.org>
Signed-off-by: Waiman Long <longman@redhat.com>
Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Cc: Andrea Parri <parri.andrea@gmail.com>
Cc: Boqun Feng <boqun.feng@gmail.com>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Pan Xinhui <xinhui@linux.vnet.ibm.com>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: Will Deacon <will.deacon@arm.com>
Link: http://lkml.kernel.org/r/1502741222-24360-1-git-send-email-longman@redhat.com
Signed-off-by: Ingo Molnar <mingo@kernel.org>
2017-08-29 15:14:38 +02:00
..
lockdep_internals.h locking/lockdep: Avoid creating redundant links 2017-08-10 12:29:04 +02:00
lockdep_proc.c locking/lockdep: Avoid creating redundant links 2017-08-10 12:29:04 +02:00
lockdep_states.h locking/lockdep: Rework FS_RECLAIM annotation 2017-08-10 12:29:03 +02:00
lockdep.c locking/lockdep: Untangle xhlock history save/restore from task independence 2017-08-29 15:14:38 +02:00
locktorture.c sched/headers: Prepare for the removal of <linux/rtmutex.h> from <linux/sched.h> 2017-03-02 08:42:32 +01:00
Makefile locking/ww_mutex: Begin kselftests for ww_mutex 2017-01-14 11:37:14 +01:00
mcs_spinlock.h
mutex-debug.c
mutex-debug.h locking/mutex: Fix lockdep_assert_held() fail 2017-01-30 11:42:59 +01:00
mutex.c mutex, futex: adjust kernel-doc markups to generate ReST 2017-05-16 08:43:25 -03:00
mutex.h locking/mutex: Fix lockdep_assert_held() fail 2017-01-30 11:42:59 +01:00
osq_lock.c locking/osq_lock: Fix osq_lock queue corruption 2017-08-10 12:28:54 +02:00
percpu-rwsem.c locking/percpu-rwsem: Replace waitqueue with rcuwait 2017-01-14 11:14:35 +01:00
qrwlock.c kernel/locking: Fix compile error with qrwlock.c 2017-05-25 12:06:50 -07:00
qspinlock_paravirt.h locking/pvqspinlock: Relax cmpxchg's to improve performance on some architectures 2017-08-29 15:14:38 +02:00
qspinlock_stat.h sched/headers: Prepare for new header dependencies before moving code to <linux/sched/clock.h> 2017-03-02 08:42:27 +01:00
qspinlock.c locking/qspinlock: Explicitly include asm/prefetch.h 2017-07-08 11:01:11 +02:00
rtmutex_common.h futex: Allow for compiling out PI support 2017-08-01 14:36:35 +02:00
rtmutex-debug.c rt_mutex: Add lockdep annotations 2017-06-08 10:35:49 +02:00
rtmutex-debug.h rt_mutex: Add lockdep annotations 2017-06-08 10:35:49 +02:00
rtmutex.c locking/rtmutex: Remove unnecessary priority adjustment 2017-07-13 11:44:06 +02:00
rtmutex.h rt_mutex: Add lockdep annotations 2017-06-08 10:35:49 +02:00
rwsem-spinlock.c locking/rwsem-spinlock: Add killable versions of __down_read() 2017-08-10 12:28:55 +02:00
rwsem-xadd.c locking/rwsem-xadd: Add killable versions of rwsem_down_read_failed() 2017-08-10 12:28:55 +02:00
rwsem.c locking/lockdep: Add new check to lock_downgrade() 2017-03-16 09:57:07 +01:00
rwsem.h
semaphore.c sched/headers: Prepare for new header dependencies before moving code to <linux/sched/debug.h> 2017-03-02 08:42:34 +01:00
spinlock_debug.c locking/spinlock/debug: Remove spinlock lockup detection code 2017-02-10 09:09:49 +01:00
spinlock.c
test-ww_mutex.c locking/ww-mutex: Limit stress test to 2 seconds 2017-03-30 09:49:47 +02:00