kernel_optimize_test/drivers
Davidlohr Bueso 642fa448ae sched/core: Remove set_task_state()
This is a nasty interface and setting the state of a foreign task must
not be done. As of the following commit:

  be628be095 ("bcache: Make gc wakeup sane, remove set_task_state()")

... everyone in the kernel calls set_task_state() with current, allowing
the helper to be removed.

However, as the comment indicates, it is still around for those archs
where computing current is more expensive than using a pointer, at least
in theory. An important arch that is affected is arm64, however this has
been addressed now [1] and performance is up to par making no difference
with either calls.

Of all the callers, if any, it's the locking bits that would care most
about this -- ie: we end up passing a tsk pointer to a lot of the lock
slowpath, and setting ->state on that. The following numbers are based
on two tests: a custom ad-hoc microbenchmark that just measures
latencies (for ~65 million calls) between get_task_state() vs
get_current_state().

Secondly for a higher overview, an unlink microbenchmark was used,
which pounds on a single file with open, close,unlink combos with
increasing thread counts (up to 4x ncpus). While the workload is quite
unrealistic, it does contend a lot on the inode mutex or now rwsem.

[1] https://lkml.kernel.org/r/1483468021-8237-1-git-send-email-mark.rutland@arm.com

== 1. x86-64 ==

Avg runtime set_task_state():    601 msecs
Avg runtime set_current_state(): 552 msecs

                                            vanilla                 dirty
Hmean    unlink1-processes-2      36089.26 (  0.00%)    38977.33 (  8.00%)
Hmean    unlink1-processes-5      28555.01 (  0.00%)    29832.55 (  4.28%)
Hmean    unlink1-processes-8      37323.75 (  0.00%)    44974.57 ( 20.50%)
Hmean    unlink1-processes-12     43571.88 (  0.00%)    44283.01 (  1.63%)
Hmean    unlink1-processes-21     34431.52 (  0.00%)    38284.45 ( 11.19%)
Hmean    unlink1-processes-30     34813.26 (  0.00%)    37975.17 (  9.08%)
Hmean    unlink1-processes-48     37048.90 (  0.00%)    39862.78 (  7.59%)
Hmean    unlink1-processes-79     35630.01 (  0.00%)    36855.30 (  3.44%)
Hmean    unlink1-processes-110    36115.85 (  0.00%)    39843.91 ( 10.32%)
Hmean    unlink1-processes-141    32546.96 (  0.00%)    35418.52 (  8.82%)
Hmean    unlink1-processes-172    34674.79 (  0.00%)    36899.21 (  6.42%)
Hmean    unlink1-processes-203    37303.11 (  0.00%)    36393.04 ( -2.44%)
Hmean    unlink1-processes-224    35712.13 (  0.00%)    36685.96 (  2.73%)

== 2. ppc64le ==

Avg runtime set_task_state():  938 msecs
Avg runtime set_current_state: 940 msecs

                                            vanilla                 dirty
Hmean    unlink1-processes-2      19269.19 (  0.00%)    30704.50 ( 59.35%)
Hmean    unlink1-processes-5      20106.15 (  0.00%)    21804.15 (  8.45%)
Hmean    unlink1-processes-8      17496.97 (  0.00%)    17243.28 ( -1.45%)
Hmean    unlink1-processes-12     14224.15 (  0.00%)    17240.21 ( 21.20%)
Hmean    unlink1-processes-21     14155.66 (  0.00%)    15681.23 ( 10.78%)
Hmean    unlink1-processes-30     14450.70 (  0.00%)    15995.83 ( 10.69%)
Hmean    unlink1-processes-48     16945.57 (  0.00%)    16370.42 ( -3.39%)
Hmean    unlink1-processes-79     15788.39 (  0.00%)    14639.27 ( -7.28%)
Hmean    unlink1-processes-110    14268.48 (  0.00%)    14377.40 (  0.76%)
Hmean    unlink1-processes-141    14023.65 (  0.00%)    16271.69 ( 16.03%)
Hmean    unlink1-processes-172    13417.62 (  0.00%)    16067.55 ( 19.75%)
Hmean    unlink1-processes-203    15293.08 (  0.00%)    15440.40 (  0.96%)
Hmean    unlink1-processes-234    13719.32 (  0.00%)    16190.74 ( 18.01%)
Hmean    unlink1-processes-265    16400.97 (  0.00%)    16115.22 ( -1.74%)
Hmean    unlink1-processes-296    14388.60 (  0.00%)    16216.13 ( 12.70%)
Hmean    unlink1-processes-320    15771.85 (  0.00%)    15905.96 (  0.85%)

x86-64 (known to be fast for get_current()/this_cpu_read_stable() caching)
and ppc64 (with paca) show similar improvements in the unlink microbenches.
The small delta for ppc64 (2ms), does not represent the gains on the unlink
runs. In the case of x86, there was a decent amount of variation in the
latency runs, but always within a 20 to 50ms increase), ppc was more constant.

Signed-off-by: Davidlohr Bueso <dbueso@suse.de>
Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Cc: Andrew Morton <akpm@linux-foundation.org>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: dave@stgolabs.net
Cc: mark.rutland@arm.com
Link: http://lkml.kernel.org/r/1483479794-14013-5-git-send-email-dave@stgolabs.net
Signed-off-by: Ingo Molnar <mingo@kernel.org>
2017-01-14 11:14:16 +01:00
..
accessibility
acpi Merge branches 'acpi-scan', 'acpi-sysfs', 'acpi-wdat' and 'acpi-tables' 2017-01-06 14:36:30 +01:00
amba
android
ata Merge branch 'for-4.10' of git://git.kernel.org/pub/scm/linux/kernel/git/tj/libata 2016-12-13 15:30:50 -08:00
atm Replace <asm/uaccess.h> with <linux/uaccess.h> globally 2016-12-24 11:46:01 -08:00
auxdisplay
base PM / domains: Fix 'may be used uninitialized' build warning 2016-12-31 21:52:07 +01:00
bcma
block zram: support BDI_CAP_STABLE_WRITES 2017-01-10 18:31:55 -08:00
bluetooth Merge branch 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs 2016-12-16 10:24:44 -08:00
bus cpu/hotplug: Cleanup state names 2016-12-25 10:47:44 +01:00
cdrom Replace <asm/uaccess.h> with <linux/uaccess.h> globally 2016-12-24 11:46:01 -08:00
char clocksource: Use a plain u64 instead of cycle_t 2016-12-25 11:04:12 +01:00
clk One fix for a broken driver on Renesas RZ/A1 SoCs with bootloaders that don't 2017-01-06 15:35:27 -08:00
clocksource Merge branch 'timers-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip 2016-12-25 14:30:04 -08:00
connector
cpufreq Merge branch 'pm-cpufreq' 2017-01-06 14:34:52 +01:00
cpuidle Replace <asm/uaccess.h> with <linux/uaccess.h> globally 2016-12-24 11:46:01 -08:00
crypto Merge branch 'linus' of git://git.kernel.org/pub/scm/linux/kernel/git/herbert/crypto-2.6 2016-12-27 17:51:36 -08:00
dax libnvdimm for 4.10 2016-12-18 15:49:10 -08:00
dca
devfreq PM / devfreq: exynos-bus: Fix the wrong return value 2017-01-03 00:21:45 +01:00
dio Replace <asm/uaccess.h> with <linux/uaccess.h> globally 2016-12-24 11:46:01 -08:00
dma ktime: Cleanup ktime_set() usage 2016-12-25 17:21:22 +01:00
dma-buf
edac Replace <asm/uaccess.h> with <linux/uaccess.h> globally 2016-12-24 11:46:01 -08:00
eisa
extcon sound updates for 4.10-rc1 2016-12-14 11:14:28 -08:00
firewire
firmware PSCI fixes for v4.10 2017-01-04 16:38:39 +01:00
fmc
fpga
gpio gpio: Move freeing of GPIO hogs before numbing of the device 2016-12-30 09:11:21 +01:00
gpu Merge branch 'drm-fixes-4.10' of git://people.freedesktop.org/~agd5f/linux into drm-fixes 2017-01-09 09:47:19 +10:00
hid HID: sensor-hub: Move the memset to sensor_hub_get_feature() 2017-01-02 14:01:30 +01:00
hsi
hv clocksource: Use a plain u64 instead of cycle_t 2016-12-25 11:04:12 +01:00
hwmon hwmon: (lm90) fix temp1_max_alarm attribute 2017-01-02 10:15:28 -08:00
hwspinlock
hwtracing coresight/etm3/4x: Consolidate hotplug state space 2016-12-25 10:47:44 +01:00
i2c Replace <asm/uaccess.h> with <linux/uaccess.h> globally 2016-12-24 11:46:01 -08:00
ide Replace <asm/uaccess.h> with <linux/uaccess.h> globally 2016-12-24 11:46:01 -08:00
idle Power management material for v4.10-rc1 2016-12-13 10:41:53 -08:00
iio First round of IIO fixes for the 4.10 cycle. 2017-01-02 16:59:44 +01:00
infiniband net/mlx4_core: Fix raw qp flow steering rules under SRIOV 2016-12-29 14:17:40 -05:00
input ktime: Cleanup ktime_set() usage 2016-12-25 17:21:22 +01:00
iommu IOMMU Fixes for Linux v4.10-rc2 2017-01-06 10:49:36 -08:00
ipack
irqchip Merge branch 'timers-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip 2016-12-25 14:30:04 -08:00
isdn Replace <asm/uaccess.h> with <linux/uaccess.h> globally 2016-12-24 11:46:01 -08:00
leds cpu/hotplug: Cleanup state names 2016-12-25 10:47:44 +01:00
lguest Replace <asm/uaccess.h> with <linux/uaccess.h> globally 2016-12-24 11:46:01 -08:00
lightnvm Char/Misc driver patches for 4.10-rc1 2016-12-13 12:11:01 -08:00
macintosh Replace <asm/uaccess.h> with <linux/uaccess.h> globally 2016-12-24 11:46:01 -08:00
mailbox ktime: Cleanup ktime_set() usage 2016-12-25 17:21:22 +01:00
mcb
md sched/core: Remove set_task_state() 2017-01-14 11:14:16 +01:00
media ktime: Cleanup ktime_set() usage 2016-12-25 17:21:22 +01:00
memory
memstick Merge branch 'for-4.10/block' of git://git.kernel.dk/linux-block 2016-12-13 10:19:16 -08:00
message Replace <asm/uaccess.h> with <linux/uaccess.h> globally 2016-12-24 11:46:01 -08:00
mfd - New Device Support 2016-12-19 08:16:26 -08:00
misc mei: move write cb to completion on credentials failures 2017-01-04 18:22:44 +01:00
mmc Replace <asm/uaccess.h> with <linux/uaccess.h> globally 2016-12-24 11:46:01 -08:00
mtd Replace <asm/uaccess.h> with <linux/uaccess.h> globally 2016-12-24 11:46:01 -08:00
net Merge branch 'akpm' (patches from Andrew) 2017-01-11 11:15:15 -08:00
nfc
ntb ntb_transport: Remove unnecessary call to ntb_peer_spad_read 2016-12-23 16:11:07 -05:00
nubus Replace <asm/uaccess.h> with <linux/uaccess.h> globally 2016-12-24 11:46:01 -08:00
nvdimm libnvdimm for 4.10 2016-12-18 15:49:10 -08:00
nvme Merge branch 'nvme-4.10' of git://git.infradead.org/nvme into for-linus 2016-12-22 11:54:46 -07:00
nvmem nvmem: fix nvmem_cell_read() return type doc 2017-01-04 18:22:47 +01:00
of pci-v4.10-changes 2016-12-15 12:46:48 -08:00
oprofile Replace <asm/uaccess.h> with <linux/uaccess.h> globally 2016-12-24 11:46:01 -08:00
parisc Replace <asm/uaccess.h> with <linux/uaccess.h> globally 2016-12-24 11:46:01 -08:00
parport Replace <asm/uaccess.h> with <linux/uaccess.h> globally 2016-12-24 11:46:01 -08:00
pci ktime: Cleanup ktime_set() usage 2016-12-25 17:21:22 +01:00
pcmcia drivers/pcmcia/m32r_pcc.c: check return from add_pcc_socket 2016-12-12 18:55:06 -08:00
perf cpu/hotplug: Cleanup state names 2016-12-25 10:47:44 +01:00
phy SCSI misc on 20161213 2016-12-14 10:49:33 -08:00
pinctrl pinctrl: samsung: Fix the width of PINCFG_TYPE_DRV bitfields for Exynos5433 2016-12-30 14:27:42 +01:00
platform platform/x86: fujitsu-laptop: use brightness_set_blocking for LED-setting callbacks 2016-12-28 12:38:10 +02:00
pnp Replace <asm/uaccess.h> with <linux/uaccess.h> globally 2016-12-24 11:46:01 -08:00
power ktime: Cleanup ktime_set() usage 2016-12-25 17:21:22 +01:00
powercap
pps
ps3
ptp Merge branch 'timers-core-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip 2016-12-12 19:56:15 -08:00
pwm pwm: Changes for v4.10-rc1 2016-12-15 11:45:13 -08:00
rapidio
ras
regulator - New Device Support 2016-12-19 08:16:26 -08:00
remoteproc remoteproc: qcom_adsp_pil: select qcom_scm 2016-12-09 16:16:56 -08:00
reset ARM: SoC driver updates for v4.10 2016-12-15 16:03:25 -08:00
rpmsg rpmsg updates for v4.10 2016-12-13 08:52:45 -08:00
rtc ktime: Cleanup ktime_set() usage 2016-12-25 17:21:22 +01:00
s390 ktime: Cleanup ktime_set() usage 2016-12-25 17:21:22 +01:00
sbus Replace <asm/uaccess.h> with <linux/uaccess.h> globally 2016-12-24 11:46:01 -08:00
scsi Merge branch 'timers-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip 2016-12-25 14:30:04 -08:00
sfi
sh lib: radix-tree: check accounting of existing slot replacement users 2016-12-12 18:55:08 -08:00
sn
soc powerpc updates for 4.10 2016-12-16 09:26:42 -08:00
spi dmaengine updates for 4.10-rc1 2016-12-14 20:42:45 -08:00
spmi
ssb
staging sched/core: Remove set_task_state() 2017-01-14 11:14:16 +01:00
target Merge branch 'scsi-target-for-v4.10' of git://git.kernel.org/pub/scm/linux/kernel/git/bvanassche/linux 2016-12-21 10:16:05 -08:00
tc
thermal Power management material for v4.10-rc1 2016-12-13 10:41:53 -08:00
thunderbolt Char/Misc driver patches for 4.10-rc1 2016-12-13 12:11:01 -08:00
tty sched/core: Remove set_task_state() 2017-01-14 11:14:16 +01:00
uio uio-hv-generic: store physical addresses instead of virtual 2016-12-10 14:57:58 +01:00
usb USB: fix problems with duplicate endpoint addresses 2017-01-05 19:38:40 +01:00
uwb
vfio vfio-pci: Handle error from pci_iomap 2017-01-04 08:34:39 -07:00
vhost Merge branch 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs 2016-12-16 10:24:44 -08:00
video video: fbdev: cobalt_lcdfb: Handle return NULL error from devm_ioremap 2017-01-04 12:58:45 +01:00
virt
virtio virtio_mmio: Set dev.release() to avoid warning 2016-12-16 00:13:39 +02:00
vlynq
vme
w1
watchdog Watchdog updates for v4.10 2016-12-24 11:27:45 -08:00
xen Merge branch 'stable/for-linus-4.10' of git://git.kernel.org/pub/scm/linux/kernel/git/konrad/swiotlb 2017-01-06 10:53:21 -08:00
zorro Replace <asm/uaccess.h> with <linux/uaccess.h> globally 2016-12-24 11:46:01 -08:00
Kconfig
Makefile