Add a bare tracepoint trace_sched_update_nr_running_tp which tracks
->nr_running CPU's rq. This is used to accurately trace this data and
provide a visualization of scheduler imbalances in, for example, the
form of a heat map. The tracepoint is accessed by loading an external
kernel module. An example module (forked from Qais' module and including
the pelt related tracepoints) can be found at:
https://github.com/auldp/tracepoints-helpers.git
A script to turn the trace-cmd report output into a heatmap plot can be
found at:
https://github.com/jirvoz/plot-nr-running
The tracepoints are added to add_nr_running() and sub_nr_running() which
are in kernel/sched/sched.h. In order to avoid CREATE_TRACE_POINTS in
the header a wrapper call is used and the trace/events/sched.h include
is moved before sched.h in kernel/sched/core.
Signed-off-by: Phil Auld <pauld@redhat.com>
Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Link: https://lkml.kernel.org/r/20200629192303.GC120228@lorien.usersys.redhat.com
With the existing implementation of store_rps_map(), packets are queued
in the receive path on the backlog queues of other CPUs irrespective of
whether they are isolated or not. This could add a latency overhead to
any RT workload that is running on the same CPU.
Ensure that store_rps_map() only uses available housekeeping CPUs for
storing the rps_map.
Signed-off-by: Alex Belits <abelits@marvell.com>
Signed-off-by: Nitesh Narayan Lal <nitesh@redhat.com>
Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Link: https://lkml.kernel.org/r/20200625223443.2684-4-nitesh@redhat.com
pci_call_probe() prevents the nesting of work_on_cpu() for a scenario
where a VF device is probed from work_on_cpu() of the PF.
Replace the cpumask used in pci_call_probe() from all online CPUs to only
housekeeping CPUs. This is to ensure that there are no additional latency
overheads caused due to the pinning of jobs on isolated CPUs.
Signed-off-by: Alex Belits <abelits@marvell.com>
Signed-off-by: Nitesh Narayan Lal <nitesh@redhat.com>
Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Reviewed-by: Frederic Weisbecker <frederic@kernel.org>
Acked-by: Bjorn Helgaas <bhelgaas@google.com>
Link: https://lkml.kernel.org/r/20200625223443.2684-3-nitesh@redhat.com
The current implementation of cpumask_local_spread() does not respect the
isolated CPUs, i.e., even if a CPU has been isolated for Real-Time task,
it will return it to the caller for pinning of its IRQ threads. Having
these unwanted IRQ threads on an isolated CPU adds up to a latency
overhead.
Restrict the CPUs that are returned for spreading IRQs only to the
available housekeeping CPUs.
Signed-off-by: Alex Belits <abelits@marvell.com>
Signed-off-by: Nitesh Narayan Lal <nitesh@redhat.com>
Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Link: https://lkml.kernel.org/r/20200625223443.2684-2-nitesh@redhat.com
There is a report that when uclamp is enabled, a netperf UDP test
regresses compared to a kernel compiled without uclamp.
https://lore.kernel.org/lkml/20200529100806.GA3070@suse.de/
While investigating the root cause, there were no sign that the uclamp
code is doing anything particularly expensive but could suffer from bad
cache behavior under certain circumstances that are yet to be
understood.
https://lore.kernel.org/lkml/20200616110824.dgkkbyapn3io6wik@e107158-lin/
To reduce the pressure on the fast path anyway, add a static key that is
by default will skip executing uclamp logic in the
enqueue/dequeue_task() fast path until it's needed.
As soon as the user start using util clamp by:
1. Changing uclamp value of a task with sched_setattr()
2. Modifying the default sysctl_sched_util_clamp_{min, max}
3. Modifying the default cpu.uclamp.{min, max} value in cgroup
We flip the static key now that the user has opted to use util clamp.
Effectively re-introducing uclamp logic in the enqueue/dequeue_task()
fast path. It stays on from that point forward until the next reboot.
This should help minimize the effect of util clamp on workloads that
don't need it but still allow distros to ship their kernels with uclamp
compiled in by default.
SCHED_WARN_ON() in uclamp_rq_dec_id() was removed since now we can end
up with unbalanced call to uclamp_rq_dec_id() if we flip the key while
a task is running in the rq. Since we know it is harmless we just
quietly return if we attempt a uclamp_rq_dec_id() when
rq->uclamp[].bucket[].tasks is 0.
In schedutil, we introduce a new uclamp_is_enabled() helper which takes
the static key into account to ensure RT boosting behavior is retained.
The following results demonstrates how this helps on 2 Sockets Xeon E5
2x10-Cores system.
nouclamp uclamp uclamp-static-key
Hmean send-64 162.43 ( 0.00%) 157.84 * -2.82%* 163.39 * 0.59%*
Hmean send-128 324.71 ( 0.00%) 314.78 * -3.06%* 326.18 * 0.45%*
Hmean send-256 641.55 ( 0.00%) 628.67 * -2.01%* 648.12 * 1.02%*
Hmean send-1024 2525.28 ( 0.00%) 2448.26 * -3.05%* 2543.73 * 0.73%*
Hmean send-2048 4836.14 ( 0.00%) 4712.08 * -2.57%* 4867.69 * 0.65%*
Hmean send-3312 7540.83 ( 0.00%) 7425.45 * -1.53%* 7621.06 * 1.06%*
Hmean send-4096 9124.53 ( 0.00%) 8948.82 * -1.93%* 9276.25 * 1.66%*
Hmean send-8192 15589.67 ( 0.00%) 15486.35 * -0.66%* 15819.98 * 1.48%*
Hmean send-16384 26386.47 ( 0.00%) 25752.25 * -2.40%* 26773.74 * 1.47%*
The perf diff between nouclamp and uclamp-static-key when uclamp is
disabled in the fast path:
8.73% -1.55% [kernel.kallsyms] [k] try_to_wake_up
0.07% +0.04% [kernel.kallsyms] [k] deactivate_task
0.13% -0.02% [kernel.kallsyms] [k] activate_task
The diff between nouclamp and uclamp-static-key when uclamp is enabled
in the fast path:
8.73% -0.72% [kernel.kallsyms] [k] try_to_wake_up
0.13% +0.39% [kernel.kallsyms] [k] activate_task
0.07% +0.38% [kernel.kallsyms] [k] deactivate_task
Fixes: 69842cba9a ("sched/uclamp: Add CPU's clamp buckets refcounting")
Reported-by: Mel Gorman <mgorman@suse.de>
Signed-off-by: Qais Yousef <qais.yousef@arm.com>
Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Tested-by: Lukasz Luba <lukasz.luba@arm.com>
Link: https://lkml.kernel.org/r/20200630112123.12076-3-qais.yousef@arm.com
struct uclamp_rq was zeroed out entirely in assumption that in the first
call to uclamp_rq_inc() they'd be initialized correctly in accordance to
default settings.
But when next patch introduces a static key to skip
uclamp_rq_{inc,dec}() until userspace opts in to use uclamp, schedutil
will fail to perform any frequency changes because the
rq->uclamp[UCLAMP_MAX].value is zeroed at init and stays as such. Which
means all rqs are capped to 0 by default.
Fix it by making sure we do proper initialization at init without
relying on uclamp_rq_inc() doing it later.
Fixes: 69842cba9a ("sched/uclamp: Add CPU's clamp buckets refcounting")
Signed-off-by: Qais Yousef <qais.yousef@arm.com>
Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Reviewed-by: Valentin Schneider <valentin.schneider@arm.com>
Tested-by: Lukasz Luba <lukasz.luba@arm.com>
Link: https://lkml.kernel.org/r/20200630112123.12076-2-qais.yousef@arm.com
For some mysterious reason GCC-4.9 has a 64 byte section alignment for
structures, all other GCC versions (and Clang) tested (including 4.8
and 5.0) are fine with the 32 bytes alignment.
Getting this right is important for the new SCHED_DATA macro that
creates an explicitly ordered array of 'struct sched_class' in the
linker script and expect pointer arithmetic to work.
Fixes: c3a340f7e7 ("sched: Have sched_class_highest define by vmlinux.lds.h")
Reported-by: kernel test robot <lkp@intel.com>
Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Link: https://lkml.kernel.org/r/20200630144905.GX4817@hirez.programming.kicks-ass.net
While integrating rseq into glibc and replacing glibc's sched_getcpu
implementation with rseq, glibc's tests discovered an issue with
incorrect __rseq_abi.cpu_id field value right after the first time
a newly created process issues sched_setaffinity.
For the records, it triggers after building glibc and running tests, and
then issuing:
for x in {1..2000} ; do posix/tst-affinity-static & done
and shows up as:
error: Unexpected CPU 2, expected 0
error: Unexpected CPU 2, expected 0
error: Unexpected CPU 2, expected 0
error: Unexpected CPU 2, expected 0
error: Unexpected CPU 138, expected 0
error: Unexpected CPU 138, expected 0
error: Unexpected CPU 138, expected 0
error: Unexpected CPU 138, expected 0
This is caused by the scheduler invoking __set_task_cpu() directly from
sched_fork() and wake_up_new_task(), thus bypassing rseq_migrate() which
is done by set_task_cpu().
Add the missing rseq_migrate() to both functions. The only other direct
use of __set_task_cpu() is done by init_idle(), which does not involve a
user-space task.
Based on my testing with the glibc test-case, just adding rseq_migrate()
to wake_up_new_task() is sufficient to fix the observed issue. Also add
it to sched_fork() to keep things consistent.
The reason why this never triggered so far with the rseq/basic_test
selftest is unclear.
The current use of sched_getcpu(3) does not typically require it to be
always accurate. However, use of the __rseq_abi.cpu_id field within rseq
critical sections requires it to be accurate. If it is not accurate, it
can cause corruption in the per-cpu data targeted by rseq critical
sections in user-space.
Reported-By: Florian Weimer <fweimer@redhat.com>
Signed-off-by: Mathieu Desnoyers <mathieu.desnoyers@efficios.com>
Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Tested-By: Florian Weimer <fweimer@redhat.com>
Cc: stable@vger.kernel.org # v4.18+
Link: https://lkml.kernel.org/r/20200707201505.2632-1-mathieu.desnoyers@efficios.com
The recent commit:
c6e7bd7afa ("sched/core: Optimize ttwu() spinning on p->on_cpu")
moved these lines in ttwu():
p->sched_contributes_to_load = !!task_contributes_to_load(p);
p->state = TASK_WAKING;
up before:
smp_cond_load_acquire(&p->on_cpu, !VAL);
into the 'p->on_rq == 0' block, with the thinking that once we hit
schedule() the current task cannot change it's ->state anymore. And
while this is true, it is both incorrect and flawed.
It is incorrect in that we need at least an ACQUIRE on 'p->on_rq == 0'
to avoid weak hardware from re-ordering things for us. This can fairly
easily be achieved by relying on the control-dependency already in
place.
The second problem, which makes the flaw in the original argument, is
that while schedule() will not change prev->state, it will read it a
number of times (arguably too many times since it's marked volatile).
The previous condition 'p->on_cpu == 0' was sufficient because that
indicates schedule() has completed, and will no longer read
prev->state. So now the trick is to make this same true for the (much)
earlier 'prev->on_rq == 0' case.
Furthermore, in order to make the ordering stick, the 'prev->on_rq = 0'
assignment needs to he a RELEASE, but adding additional ordering to
schedule() is an unwelcome proposition at the best of times, doubly so
for mere accounting.
Luckily we can push the prev->state load up before rq->lock, with the
only caveat that we then have to re-read the state after. However, we
know that if it changed, we no longer have to worry about the blocking
path. This gives us the required ordering, if we block, we did the
prev->state load before an (effective) smp_mb() and the p->on_rq store
needs not change.
With this we end up with the effective ordering:
LOAD p->state LOAD-ACQUIRE p->on_rq == 0
MB
STORE p->on_rq, 0 STORE p->state, TASK_WAKING
which ensures the TASK_WAKING store happens after the prev->state
load, and all is well again.
Fixes: c6e7bd7afa ("sched/core: Optimize ttwu() spinning on p->on_cpu")
Reported-by: Dave Jones <davej@codemonkey.org.uk>
Reported-by: Paul Gortmaker <paul.gortmaker@windriver.com>
Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Tested-by: Dave Jones <davej@codemonkey.org.uk>
Tested-by: Paul Gortmaker <paul.gortmaker@windriver.com>
Link: https://lkml.kernel.org/r/20200707102957.GN117543@hirez.programming.kicks-ass.net
Using a mutex for "print this warning only once" is so overdesigned as
to be actively offensive to my sensitive stomach.
Just use "pr_info_once()" that already does this, although in a
(harmlessly) racy manner that can in theory cause the message to be
printed twice if more than one CPU races on that "is this the first
time" test.
[ If somebody really cares about that harmless data race (which sounds
very unlikely indeed), that person can trivially fix printk_once() by
using a simple atomic access, preferably with an optimistic non-atomic
test first before even bothering to treat the pointless "make sure it
is _really_ just once" case.
A mutex is most definitely never the right primitive to use for
something like this. ]
Yes, this is a small and meaningless detail in a code path that hardly
matters. But let's keep some code quality standards here, and not
accept outrageously bad code.
Link: https://lore.kernel.org/lkml/CAHk-=wgV9toS7GU3KmNpj8hCS9SeF+A0voHS8F275_mgLhL4Lw@mail.gmail.com/
Cc: Andy Lutomirski <luto@kernel.org>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: Peter Zijlstra (Intel) <peterz@infradead.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
- Reset MXCSR in kernel_fpu_begin() to prevent using a stale user space
value.
- Prevent writing MSR_TEST_CTRL on CPUs which are not explicitly
whitelisted for split lock detection. Some CPUs which do not support
it crash even when the MSR is written to 0 which is the default value.
- Fix the XEN PV fallout of the entry code rework
- Fix the 32bit fallout of the entry code rework
- Add more selftests to ensure that these entry problems don't come back.
- Disable 16 bit segments on XEN PV. It's not supported because XEN PV
does not implement ESPFIX64
-----BEGIN PGP SIGNATURE-----
iQJHBAABCgAxFiEEQp8+kY+LLUocC4bMphj1TA10mKEFAl8B9JoTHHRnbHhAbGlu
dXRyb25peC5kZQAKCRCmGPVMDXSYoV8LEAC6QJPDvqYUl4r0rNIRG+S6D99lQOse
1smxvgXX4UaRz5Tgz6kvYUcucqmmnTfvnO8cg82LASeFw1xfVPPAtl3GZjoClwhv
0NJkKYcMm5QUOSVjJmjkcbAld//FyRfxHuJ8HMEtrbvkys2qWBmLzMaUNhFDNhcc
73UMmyuyL4kef9v/iAeR5WXG5+b+j9lZDiC1lTWuEKs10d1EdTwt2O/wtSRRPpMn
kL1qGTJAL+iRyRe7weLOkC2KZ9+Gq2NtyJQutkthZtGe5+pLT3AT6AlWxeg1HU8q
pxaQP25oe8/8naIoOmwiuwAP2qmm5eHedzXoN0h7i2XmofYOJaWeF95K7oDro8Nj
2deCx1bk0wr/RUxbYlfUacs8S+wmMWe7+BPnHXZphkSq5Vx+oXIw6mJOqmNb7Yiv
7ld1QwSD5dyWCEk1af16XKsFvSIRiGh8FypfTiTxyk+z7HIWBNXlu8OWHn1A7Sra
iaolCZfXtTJzm4w5+VVT2FX3s7jJrmMM4iSLtM2ISo2k+1HMlTbgLE6/yGjQ3ZaY
U298W7Pm8CwBRgzyKBvZVfncm0U/B0FNo/8C0jsJKPIOdpoLhs+u7sjpyaNC+toz
GE0skoWZxMhga4xPF84ua/l1VGncVUN1d5/dmnXz8xdyxFlktUtkt2iPE4G0rt3S
Xgh2uLHOgST6Kw==
=lI9c
-----END PGP SIGNATURE-----
Merge tag 'x86-urgent-2020-07-05' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip
Pull x86 fixes from Thomas Gleixner:
"A series of fixes for x86:
- Reset MXCSR in kernel_fpu_begin() to prevent using a stale user
space value.
- Prevent writing MSR_TEST_CTRL on CPUs which are not explicitly
whitelisted for split lock detection. Some CPUs which do not
support it crash even when the MSR is written to 0 which is the
default value.
- Fix the XEN PV fallout of the entry code rework
- Fix the 32bit fallout of the entry code rework
- Add more selftests to ensure that these entry problems don't come
back.
- Disable 16 bit segments on XEN PV. It's not supported because XEN
PV does not implement ESPFIX64"
* tag 'x86-urgent-2020-07-05' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
x86/ldt: Disable 16-bit segments on Xen PV
x86/entry/32: Fix #MC and #DB wiring on x86_32
x86/entry/xen: Route #DB correctly on Xen PV
x86/entry, selftests: Further improve user entry sanity checks
x86/entry/compat: Clear RAX high bits on Xen PV SYSENTER
selftests/x86: Consolidate and fix get/set_eflags() helpers
selftests/x86/syscall_nt: Clear weird flags after each test
selftests/x86/syscall_nt: Add more flag combinations
x86/entry/64/compat: Fix Xen PV SYSENTER frame setup
x86/entry: Move SYSENTER's regs->sp and regs->flags fixups into C
x86/entry: Assert that syscalls are on the right stack
x86/split_lock: Don't write MSR_TEST_CTRL on CPUs that aren't whitelisted
x86/fpu: Reset MXCSR to default in kernel_fpu_begin()
- Ensure the atomicity of affinity updates in the GIC driver.
- Don't try to sleep in atomic context when waiting for the GICv4.1 to
respond. Use polling instead.
- Typo fixes in Kconfig and warnings.
-----BEGIN PGP SIGNATURE-----
iQJHBAABCgAxFiEEQp8+kY+LLUocC4bMphj1TA10mKEFAl8B8sETHHRnbHhAbGlu
dXRyb25peC5kZQAKCRCmGPVMDXSYoQ2AEAC/W+PkCmAqhTrsr2iHn3JOdXEVW21T
Ybivz2Bbf07kOy3bgvuLuo6/tRPh/R8fwjmrhgUZfWUcLQ1c3VSDyPPtmHVrTK8i
0oZLT1Bc2npCdAX2V8UruRyihhbQk52xb6K2OeDqVhpiuwv0PbcBpsT7No1gg5i2
FaH1F160nJG4HrWusmJKLcqvT+4t/ZWCoYKQeBzhxzZLuFvMU/e+PhUhKbr2MXdx
jBOWv+3DeEd88soxTNsrhsJPdx2i9j1WdSy2kJRhm4DTRWvWxkZ/96ugzWJqIHxV
XZsaKvAyQe9Ft62n44gkEH8RoKyZfCNSQkj3csi14W8WfDkpjCrJZeEh6gHFXewk
GbUJOC9IuwEkhkKRBDD/C0+9v0YxpW1uG/cC2+vIBxZvGqUm9X96UfuhC5F86tcJ
gEUEX0CvMWeBF0TejuOgi72FXlCFpcTGzTLSD0y8RS1I66i226AzkOAsXoZL8OdT
PLdLor7qWMFLrq+ehTkaojCV946j5ClwMK5a95By22QufID04iSsisT5MZqhladO
BdVtHPbApnrB0AXMzMHd69PoAGNeknVdZwYTNE9lNsV8W1WPJxfpnINNX7EYQL77
SOJOEbWa8h7aToy9GJpwODhYByyqeIqHMbvIeWAo3wx9GD5Kt0aqWryZa/b4PehR
W+0VyqdqSkTGIA==
=ZZ0g
-----END PGP SIGNATURE-----
Merge tag 'irq-urgent-2020-07-05' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip
Pull irq fixes from Thomas Gleixner:
"A set of interrupt chip driver fixes:
- Ensure the atomicity of affinity updates in the GIC driver
- Don't try to sleep in atomic context when waiting for the GICv4.1
to respond. Use polling instead.
- Typo fixes in Kconfig and warnings"
* tag 'irq-urgent-2020-07-05' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
irqchip/gic: Atomically update affinity
irqchip/riscv-intc: Fix a typo in a pr_warn()
irqchip/gic-v4.1: Use readx_poll_timeout_atomic() to fix sleep in atomic
irqchip/loongson-pci-msi: Fix a typo in Kconfig
-----BEGIN PGP SIGNATURE-----
iQJHBAABCgAxFiEEQp8+kY+LLUocC4bMphj1TA10mKEFAl8B8gcTHHRnbHhAbGlu
dXRyb25peC5kZQAKCRCmGPVMDXSYoXDoEADCdEVs/qLktFXrN17i6Oeju7h6oQ2m
8iI1GDStY5zj4Jk6FdQB864XXHbkNO7C096IqJOZud66k+lj3sEk9lE24tpgZqi/
gHVCueUoKF5ZYNyEtPkSDzHcr/IJg3iueQyShTbGotvGbF/gBAWJtuIq3sVpaD+Q
qvZYASQMkBNrRcEgxzaTr286MJ4lIQ61ujwRWQJV4woQgAqjeTrOKQ+qOoKCZVfB
c1glieDNLwZvs/534zsBLRj7ApvuJ2SyHXhfC9byIitUb1RdZ/1gAkiteX/K6ici
PXoPamBsd+gSEdfWN69HB+cWqPqJ8Gq8M2zcmp3KSrg4IrXTVrnYHmymH3tN5Vbe
p3my9/rH/yDv1kgcRgOlgL7ykz6W2oCr7LrTrQ7fupOXrR7UrW0dSsEcFRbWwoBn
7dlfdEI4Q7ay9GPN2f7QOiaGGE+Bi76iCXTjRTFzcEQHiwO6W1bLoSu8qtncYvke
2PaDrE4V/2CWjOuE37mw3IPsjUEOJNKC2+H3y+J9ma94CX4kAuZzH2LS4oHO8ww1
eyF1HSHOKh1tuY9RhNnsyh+1V2Iao6T0BjUMnG/c01xFeEz+lE+e1JRdTSxa5BOr
zKSSuEei7z6m9t7Yn2DSU8YaKn8ZbF8JpfC5nGFZTMBh64EOEaWGhQUr7ttuFQxi
7JF6WVarhn5FHw==
=p6ry
-----END PGP SIGNATURE-----
Merge tag 'core-urgent-2020-07-05' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip
Pull rcu fixlet from Thomas Gleixner:
"A single fix for a printk format warning in RCU"
* tag 'core-urgent-2020-07-05' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
rcuperf: Fix printk format warning
- fix various bugs in xconfig
- fix some issues in cross-compilation using Clang
- fix documentation
-----BEGIN PGP SIGNATURE-----
iQJJBAABCgAzFiEEbmPs18K1szRHjPqEPYsBB53g2wYFAl8B7PAVHG1hc2FoaXJv
eUBrZXJuZWwub3JnAAoJED2LAQed4NsGKnwP/2lAwlPJYRjBv0cLZ8HI1F8xbVFl
P+/JPkF0me/4mXU0ZJzP+zeLq9rfjL9298FwkVtPoTybPCSsONVEuoIMZ5gCXYxc
IJ6o7pwmSF7T7VNtI6lx+l4BKmULYnPpTnZsqutKKLjAO+o2SiHju3ZSgfWUXVuc
NyQIFSBQzoI1KkbNHpuAryWc0WXm6Gfeg3//Sqqk/pPmXkNcQAIzM274HvrSnCvq
/RLUd0SDTXP7XlbleZQhms1FJp3IXkSXAGdA0HsF8sH8oCz+4wWcXk72k4j9INXi
aPACxZvO/hM4N9LJliL4eJjjHfmbnDppczr+Kb5KYCb9dEV9vVkbYWyz98YA+/Y/
1EgynxLxHrU534M7tAhRAb0k/xVGodhRj0KEMPGRgzeWGtXaxKSe0Z5NBqs0nj5Z
dMtJuzEMG1ey56jhRj2408IUUmOmHEh+IDsM8HQ/tcjrI5fpKP505RT6gvLBWX6M
IPPYo2cC9wHfyEC1dPjJ+aaOeqnSJ9+7ui7iv5vJQ1M6PyOkHeXaU1jCeOGg5qLe
zfjn2U11uK3BLgedzahB/lEBXfblDlFuG0b7XyXnyyhkICt1CT7aPvznrGxSUlEw
EuZLrIMHqRwMyrs7PTo41m2hf3hqh+juhPuxiEngi5NqH/GKBnvLJQDGFkbupqim
ZFGZnMVeZR7Prulc
=0BKC
-----END PGP SIGNATURE-----
Merge tag 'kbuild-fixes-v5.8-2' of git://git.kernel.org/pub/scm/linux/kernel/git/masahiroy/linux-kbuild
Pull Kbuild fixes frin Masahiro Yamada:
- fix various bugs in xconfig
- fix some issues in cross-compilation using Clang
- fix documentation
* tag 'kbuild-fixes-v5.8-2' of git://git.kernel.org/pub/scm/linux/kernel/git/masahiroy/linux-kbuild:
.gitignore: Do not track `defconfig` from `make savedefconfig`
kbuild: make Clang build userprogs for target architecture
kbuild: fix CONFIG_CC_CAN_LINK(_STATIC) for cross-compilation with Clang
kconfig: qconf: parse newer types at debug info
kconfig: qconf: navigate menus on hyperlinks
kconfig: qconf: don't show goback button on splitMode
kconfig: qconf: simplify the goBack() logic
kconfig: qconf: re-implement setSelected()
kconfig: qconf: make debug links work again
kconfig: qconf: make search fully work again on split mode
kconfig: qconf: cleanup includes
docs: kbuild: fix ReST formatting
gcc-plugins: fix gcc-plugins directory path in documentation
Four small fixes in three drivers. The mptfusion one has actually
caused use visible issues in certain kernel configurations.
Signed-off-by: James E.J. Bottomley <jejb@linux.ibm.com>
-----BEGIN PGP SIGNATURE-----
iJwEABMIAEQWIQTnYEDbdso9F2cI+arnQslM7pishQUCXwHvbyYcamFtZXMuYm90
dG9tbGV5QGhhbnNlbnBhcnRuZXJzaGlwLmNvbQAKCRDnQslM7pishdIvAP9MBVi0
dB0UNCI3vFzp1tSSEADFqBBWPceCoT0eBMOgfwD/Vt57Hxk99OFtoxQf4vkswtry
eZDpyMTwZxEGGQW3XWU=
=VXXZ
-----END PGP SIGNATURE-----
Merge tag 'scsi-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/jejb/scsi
Pull SCSI fixes from James Bottomley:
"Four small fixes in three drivers.
The mptfusion one has actually caused user visible issues in certain
kernel configurations"
* tag 'scsi-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/jejb/scsi:
scsi: mptfusion: Don't use GFP_ATOMIC for larger DMA allocations
scsi: libfc: Skip additional kref updating work event
scsi: libfc: Handling of extra kref
scsi: qla2xxx: Fix a condition in qla2x00_find_all_fabric_devs()
-----BEGIN PGP SIGNATURE-----
iQJEBAABCAAuFiEEwPw5LcreJtl1+l5K99NY+ylx4KYFAl8BDy4QHGF4Ym9lQGtl
cm5lbC5kawAKCRD301j7KXHgpqyUD/0XI7Jo1W63aEwgW9wD1Xiyadc7eKzEFc/x
upfqYBGiRUQehTKdmNBfr1ocrWF9OGj1g4NtPlU81Zjp1Y6c6pBuzeFF6NrfwEVi
GrOO4nm04t4BOIk9AnsIjqknnk2XenbjFZmBNo0TKz3W3ftOPXSNDtJDgjxJ+rGd
y5WOMfFCrE5rvo+JWiG3vxZIfTx8cxtraNw2PWcmxqjwOL+jNiN7E5rW/O4t0+DS
1ajqv5KseTEVtDNKG/Vn04cXxMVG8upG+Jv3xvxu4AlqJk84/va1LxkfvUuPuxJe
c7dbGfR5db/KVdTsHU/WVo6URJ5nioftkMIHgIhOIIJR5D/B7WGFPu5AZtwRze6s
C7BNIF49rBfbxyfLsVdIaAiw8GLQmsJWLs13OEVNRNGDxPO65as74J0E3UO9vOPa
MCKffqkeSVHGK5LaXnhzn0lTEn35StUjWXRuyKAFxTWtSNDptopaoGZCrFO1IFXz
EQfFlwU/fUNyfujAkMq7kNCxeQ0Kh7co6v41zphn8gBanpKgk5AqhnBJOSbI7OAS
TDVMaQTzi3M+kMJV0Fu0rYQ5E3eiY3VAwif3L+6QiccwwgEygwIkdSLo2g63Q7CX
Ogw8J2LIhuwbB/fCHhs7WfgfMRmgQcsaGWFLjI27UYd0FQks+rbDS8DItDuWgiBQ
74skOmzL5w==
=dhze
-----END PGP SIGNATURE-----
Merge tag 'block-5.8-2020-07-05' of git://git.kernel.dk/linux-block
Pull block fixes from Jens Axboe:
- NVMe fixes from Christoph:
- Fix crash in multi-path disk add (Christoph)
- Fix ignore of identify error (Sagi)
- Fix a compiler complaint that a function should be static (Wei)
* tag 'block-5.8-2020-07-05' of git://git.kernel.dk/linux-block:
block: make function __bio_integrity_free() static
nvme: fix a crash in nvme_mpath_add_disk
nvme: fix identify error status silent ignore
-----BEGIN PGP SIGNATURE-----
iQJEBAABCAAuFiEEwPw5LcreJtl1+l5K99NY+ylx4KYFAl8BDx4QHGF4Ym9lQGtl
cm5lbC5kawAKCRD301j7KXHgpvhzD/4rxzJsn6ukrsxMXFaKIrjZ/hkcRJIMNozz
YWu4PwcDvszvZu66MeAu0tnCttzxlIgP8oCm6cx9ImMQwkYIVbV0q1XJ3wmzUQpZ
pEDW4j0j8hgcLhfZH9ojUAkTP8TnltakxkrwC6egUvnT0vuKDUy5ISbkl4uxWYpH
p4Dq7ASqy8xjtzac/VLTSzBgzhTMSic5NMJY21md9eAaFB1vYBmDyHB3O1bEk4kw
pvWGFm7a4qssnAB61SMfq3nWQ9UA0+XX4a+CWEzJIMqj4H6UpjOCQU23X1AlaLJX
ILeq26PwoZQF8cS4D83tMnmPWz1LqslBgnUuAGCVLsT7omvhDLM75iFBpMzWglLu
No8TlxLZ+Dga04vpjeEptWqSfUS6K879cNJuFGjadBogq06SImIVDHXXTrPhtCGg
B9+uFHkOUlIkjM5h2zqdkmhnbf0sWodowIrx7+aL294QVlqnY0uBR9eh6+CSKT+h
PhJ+FhN+N6B1dTyryaO5hMjyg0h4ZpvIMT3HBpNXtnRVlUT2+OYN3g5HHt6z//Rp
eeJTh7pnY7uT60c8x96kySwQIydXSKBI+7ysLlntgiyvutbzaC5Fq7/f1YTWyNVk
zqM/+FuJUsstu0y/GBEDpglpL1+S9VjNcJUDpUMUKwCAkh7TnI/ATo1rn9GiM1n1
SQZ4HcaCYw==
=Uawr
-----END PGP SIGNATURE-----
Merge tag 'io_uring-5.8-2020-07-05' of git://git.kernel.dk/linux-block
Pull io_uring fix from Jens Axboe:
"Andres reported a regression with the fix that was merged earlier this
week, where his setup of using signals to interrupt io_uring CQ waits
no longer worked correctly.
Fix this, and also limit our use of TWA_SIGNAL to the case where we
need it, and continue using TWA_RESUME for task_work as before.
Since the original is marked for 5.7 stable, let's flush this one out
early"
* tag 'io_uring-5.8-2020-07-05' of git://git.kernel.dk/linux-block:
io_uring: fix regression with always ignoring signals in io_cqring_wait()
Pull i2c fixes from Wolfram Sang:
"The usual driver fixes and documentation updates"
* 'i2c/for-current' of git://git.kernel.org/pub/scm/linux/kernel/git/wsa/linux:
i2c: mlxcpld: check correct size of maximum RECV_LEN packet
i2c: add Kconfig help text for slave mode
i2c: slave-eeprom: update documentation
i2c: eg20t: Load module automatically if ID matches
i2c: designware: platdrv: Set class based on DMI
i2c: algo-pca: Add 0x78 as SCL stuck low status for PCA9665
- fix for missing hazard barrier
- DT fix for ingenic
- DT fix of GPHY names for lantiq
- fix usage of smp_processor_id() while preemption is enabled
-----BEGIN PGP SIGNATURE-----
iQJOBAABCAA4FiEEbt46xwy6kEcDOXoUeZbBVTGwZHAFAl8Bpt8aHHRzYm9nZW5k
QGFscGhhLmZyYW5rZW4uZGUACgkQeZbBVTGwZHBVlA/+JsNymfzLCaUHgEyjDzfp
R7x3/UUNHOF659MKebEIJEd/Rmhj+pg5682e3SugpNlOxuadB7Kl1j0SYNdVbR0h
Dg7ztP3osWbymoJG829xLvpYMlLGLuNaNUBV/8mFNwPy2bijmgkObYyeciFEQlvy
skHihVZCYQ1qVqMtDlNEmMGU0V6JTHuqOfdF7+d7ZmkwHpKGDBgR0BL3rhQREHif
iawKjkhuemgVpw0g6ULpuWlwvsgTbQNoaMIIGIlsaGfYWlyBnlnhbiZHg+WwC5Ey
zzuDFybQq9K+cylgwlrn7ypxCpUBfKCVzYUcEOcQC4+BPt74t1mwfS24FQS3HDol
pQ9lpIPLkm5m0kxokUT8Ei/lcSA1NeiubMOGQJaEc+7gpyBTcw+ChLB242cilngB
CzME5hppGEQlkBefS8CYZaOGUhhU0qaqm6WMkcQ0YIuiyo+ZmwQ6nwyVNbDB/BMb
vK99mgCf96PWqu8vcCcifC+O/wSBOUrMD3vljAswY6xwP9gQ4WYFAcielEXoSVMV
sIlVHNbDivpb6e41zerK+KU9Z1oCgPnFKT6FmkDtdQWQ4iDfOEUi/n72NlNfH5xT
MDGaaYVYuW3M1eR4Tlahe+UA2qIleZDc9DgORhu1kwlxecMfQSaBdKh1G9ifqw58
ZbzM1YrJHHh2xEBvhjpGaZU=
=A4AW
-----END PGP SIGNATURE-----
Merge tag 'mips_fixes_5.8_1' of git://git.kernel.org/pub/scm/linux/kernel/git/mips/linux
Pull MIPS fixes from Thomas Bogendoerfer:
- fix for missing hazard barrier
- DT fix for ingenic
- DT fix of GPHY names for lantiq
- fix usage of smp_processor_id() while preemption is enabled
* tag 'mips_fixes_5.8_1' of git://git.kernel.org/pub/scm/linux/kernel/git/mips/linux:
MIPS: Do not use smp_processor_id() in preemptible code
MIPS: Add missing EHB in mtc0 -> mfc0 sequence for DSPen
MIPS: ingenic: gcw0: Fix HP detection GPIO.
MIPS: lantiq: xway: sysctrl: fix the GPHY clock alias names
This resolves the hazard between the mtc0 in the change_c0_status() and
the mfc0 in configure_exception_vector(). Without resolving this hazard
configure_exception_vector() could read an old value and would restore
this old value again. This would revert the changes change_c0_status()
did. I checked this by printing out the read_c0_status() at the end of
per_cpu_trap_init() and the ST0_MX is not set without this patch.
The hazard is documented in the MIPS Architecture Reference Manual Vol.
III: MIPS32/microMIPS32 Privileged Resource Architecture (MD00088), rev
6.03 table 8.1 which includes:
Producer | Consumer | Hazard
----------|----------|----------------------------
mtc0 | mfc0 | any coprocessor 0 register
I saw this hazard on an Atheros AR9344 rev 2 SoC with a MIPS 74Kc CPU.
There the change_c0_status() function would activate the DSPen by
setting ST0_MX in the c0_status register. This was reverted and then the
system got a DSP exception when the DSP registers were saved in
save_dsp() in the first process switch. The crash looks like this:
[ 0.089999] Mount-cache hash table entries: 1024 (order: 0, 4096 bytes, linear)
[ 0.097796] Mountpoint-cache hash table entries: 1024 (order: 0, 4096 bytes, linear)
[ 0.107070] Kernel panic - not syncing: Unexpected DSP exception
[ 0.113470] Rebooting in 1 seconds..
We saw this problem in OpenWrt only on the MIPS 74Kc based Atheros SoCs,
not on the 24Kc based SoCs. We only saw it with kernel 5.4 not with
kernel 4.19, in addition we had to use GCC 8.4 or 9.X, with GCC 8.3 it
did not happen.
In the kernel I bisected this problem to commit 9012d01166 ("compiler:
allow all arches to enable CONFIG_OPTIMIZE_INLINING"), but when this was
reverted it also happened after commit 172dcd935c ("MIPS: Always
allocate exception vector for MIPSr2+").
Commit 0b24cae4d5 ("MIPS: Add missing EHB in mtc0 -> mfc0 sequence.")
does similar changes to a different file. I am not sure if there are
more places affected by this problem.
Signed-off-by: Hauke Mehrtens <hauke@hauke-m.de>
Cc: <stable@vger.kernel.org>
Signed-off-by: Thomas Bogendoerfer <tsbogend@alpha.franken.de>
Running `make savedefconfig` creates by default `defconfig`, which is,
currently, on git’s radar, for example, `git status` lists this file as
untracked.
So, add the file to `.gitignore`, so it’s ignored by git.
Signed-off-by: Paul Menzel <pmenzel@molgen.mpg.de>
Acked-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Signed-off-by: Masahiro Yamada <masahiroy@kernel.org>
One fix for a regression in our pkey handling, which exhibits as PROT_EXEC
mappings taking continuous page faults.
Thanks to:
Jan Stancek, Aneesh Kumar K.V.
-----BEGIN PGP SIGNATURE-----
iQJHBAABCAAxFiEEJFGtCPCthwEv2Y/bUevqPMjhpYAFAl8AgLcTHG1wZUBlbGxl
cm1hbi5pZC5hdQAKCRBR6+o8yOGlgPPVD/9ocLMHUsHTOtZCpSjJ/FcQILDapUty
/TRutcDxS8YYEyvUn+Azhsezigr+raeHsiqw7qA9xk763RfCmGNb35GyTGgA/93L
XLN7P04O8tQSz5Nm9AGB2d0HOD7Y/ihabtQ9dZf6SYwEupfP5reA39uj/11gqOlP
6iHhzf0hyjREojy/fjsP+4J5FSukkmte/d2HiRfgwyfAkWvhCsp3cqu7wtciD/8g
kCyl2TY0vpbaRaW6jGNwZD9pp6rVOQzc13F+ery6oqKuVdpmg3ormrBXtE+oATXS
IX4qsO3nRO/nuImaWFW9nkM3s4RNMjNYYWwjC6XrqD6IqlLUgxsESorI6ShvCuiA
kwI2auSxqesdmBNOpVB0qvdShKyVO1Xcy2ImGgHq0YNgwKpkR8e8xKqJKIXxBHOj
jGHneLrH2NzJMIQOwdCPVjoxoQdIcIs2B5L20bCVvtAEtSr4rSPeTpxPEdbikiKv
tQP2eRO9n2LNSxX8+aLdfytzOThS9LDDUa+zAaWl94lNNzcLnCvMjnRoYYHoISq4
x+ncTs1h1j2Z7GmYCLAEOUUuLyDR7HXszxVtIQYK1MHcFz+prcorlKNTZ4ai4Azu
eMDMJvavmXhkAmCkqL6W8CoWsK1sBDdcS2iBWm+EAsyhIHum2YJWDZb4O5bzNQxK
1HBgRcjpsDUWQQ==
=c2Xc
-----END PGP SIGNATURE-----
Merge tag 'powerpc-5.8-5' of git://git.kernel.org/pub/scm/linux/kernel/git/powerpc/linux
Pull powerpc fixes from Michael Ellerman:
"One fix for a regression in our pkey handling, which exhibits as
PROT_EXEC mappings taking continuous page faults.
Thanks to: Jan Stancek, Aneesh Kumar K.V"
* tag 'powerpc-5.8-5' of git://git.kernel.org/pub/scm/linux/kernel/git/powerpc/linux:
powerpc/mm/pkeys: Make pkey access check work on execute_only_key
- Fix alternative patching for very large kernel images and modules
- Hook up existing CPU errata workarounds for Qualcomm Kryo CPUs
-----BEGIN PGP SIGNATURE-----
iQFEBAABCgAuFiEEPxTL6PPUbjXGY88ct6xw3ITBYzQFAl8AcEIQHHdpbGxAa2Vy
bmVsLm9yZwAKCRC3rHDchMFjNEIIB/wNrSaAzQT+L8M3uxjy5v2T0eX741X0oi3o
XBP1xQai5i6sANy1HvhvcShdX25pbL9Z4rRVHz6S+u/fpF1z000NItsdRDtw2GYL
bK+rg6yjtaAX9qkrohbEniRVd2HcGec/rQDpBlkv4LMBwTwqs944xcOuDbITTMgk
pqSfHVPTIgxYgsBAkhxvpL4XPZbG9u88Iy62GWIzFmyxatFWj9NbV8hiTVfQGxDC
zCMbGAjIM2lCignIo9xbzHoCTPUb4WJfsLDqlnhLLtrb9VIk1+2+tfQD3mXkaGwQ
4CvsZYl00V+NOPdAPZULB9KAYFad00RwhCKcjaHcdlTqXX2f8+0S
=d4C0
-----END PGP SIGNATURE-----
Merge tag 'arm64-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/arm64/linux
Pull arm64 fixes from Will Deacon:
"Nothing earth-shattering, really - some CPU errata workarounds (one
day they'll get it right, ha!) and a fix for a boot failure with very
large kernel images where the alternative patching gets confused when
patching relative branches using veneers.
- Fix alternative patching for very large kernel images and modules
- Hook up existing CPU errata workarounds for Qualcomm Kryo CPUs"
* tag 'arm64-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/arm64/linux:
arm64: Add KRYO4XX silver CPU cores to erratum list 1530923 and 1024718
arm64: Add KRYO4XX gold CPU cores to erratum list 1463225 and 1418040
arm64: Add MIDR value for KRYO4XX gold CPU cores
arm64/alternatives: use subsections for replacement sequences
When switching to TWA_SIGNAL for task_work notifications, we also made
any signal based condition in io_cqring_wait() return -ERESTARTSYS.
This breaks applications that rely on using signals to abort someone
waiting for events.
Check if we have a signal pending because of queued task_work, and
repeat the signal check once we've run the task_work. This provides a
reliable way of telling the two apart.
Additionally, only use TWA_SIGNAL if we are using an eventfd. If not,
we don't have the dependency situation described in the original commit,
and we can get by with just using TWA_RESUME like we previously did.
Fixes: ce593a6c48 ("io_uring: use signal based task_work running")
Cc: stable@vger.kernel.org # v5.7
Reported-by: Andres Freund <andres@anarazel.de>
Tested-by: Andres Freund <andres@anarazel.de>
Signed-off-by: Jens Axboe <axboe@kernel.dk>
Xen PV doesn't implement ESPFIX64, so they don't work right. Disable
them. Also print a warning the first time anyone tries to use a
16-bit segment on a Xen PV guest that would otherwise allow it
to help people diagnose this change in behavior.
This gets us closer to having all x86 selftests pass on Xen PV.
Signed-off-by: Andy Lutomirski <luto@kernel.org>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Acked-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Link: https://lkml.kernel.org/r/92b2975459dfe5929ecf34c3896ad920bd9e3f2d.1593795633.git.luto@kernel.org
DEFINE_IDTENTRY_MCE and DEFINE_IDTENTRY_DEBUG were wired up as non-RAW
on x86_32, but the code expected them to be RAW.
Get rid of all the macro indirection for them on 32-bit and just use
DECLARE_IDTENTRY_RAW and DEFINE_IDTENTRY_RAW directly.
Also add a warning to make sure that we only hit the _kernel paths
in kernel mode.
Reported-by: Naresh Kamboju <naresh.kamboju@linaro.org>
Signed-off-by: Andy Lutomirski <luto@kernel.org>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Acked-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Link: https://lkml.kernel.org/r/9e90a7ee8e72fd757db6d92e1e5ff16339c1ecf9.1593795633.git.luto@kernel.org
On Xen PV, #DB doesn't use IST. It still needs to be correctly routed
depending on whether it came from user or kernel mode.
Get rid of DECLARE/DEFINE_IDTENTRY_XEN -- it was too hard to follow the
logic. Instead, route #DB and NMI through DECLARE/DEFINE_IDTENTRY_RAW on
Xen, and do the right thing for #DB. Also add more warnings to the
exc_debug* handlers to make this type of failure more obvious.
This fixes various forms of corruption that happen when usermode
triggers #DB on Xen PV.
Fixes: 4c0dcd8350 ("x86/entry: Implement user mode C entry points for #DB and #MCE")
Signed-off-by: Andy Lutomirski <luto@kernel.org>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Acked-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Link: https://lkml.kernel.org/r/4163e733cce0b41658e252c6c6b3464f33fdff17.1593795633.git.luto@kernel.org
Move the clearing of the high bits of RAX after Xen PV joins the SYSENTER
path so that Xen PV doesn't skip it.
Arguably this code should be deleted instead, but that would belong in the
merge window.
Fixes: ffae641f57 ("x86/entry/64/compat: Fix Xen PV SYSENTER frame setup")
Signed-off-by: Andy Lutomirski <luto@kernel.org>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Acked-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Link: https://lkml.kernel.org/r/9d33b3f3216dcab008070f1c28b6091ae7199969.1593795633.git.luto@kernel.org
-----BEGIN PGP SIGNATURE-----
iHUEABYIAB0WIQRTLbB6QfY48x44uB6AXGG7T9hjvgUCXwAncgAKCRCAXGG7T9hj
vnLwAQCoKia3CSIzLZ6MMx/dWF+ntr3frTk2g8J02MURAA1GhgEAxybfOoOMJi0P
8+1UsjLWidW3Zh0UJHK5q9xqRw/Jkg0=
=mkfk
-----END PGP SIGNATURE-----
Merge tag 'for-linus-5.8b-rc4-tag' of git://git.kernel.org/pub/scm/linux/kernel/git/xen/tip
Pull xen fixes from Juergen Gross:
"One small cleanup patch for ARM and two patches for the xenbus driver
fixing latent problems (large stack allocations and bad return code
settings)"
* tag 'for-linus-5.8b-rc4-tag' of git://git.kernel.org/pub/scm/linux/kernel/git/xen/tip:
xen/xenbus: let xenbus_map_ring_valloc() return errno values only
xen/xenbus: avoid large structs and arrays on the stack
arm/xen: remove the unused macro GRANT_TABLE_PHYSADDR
I2C_SMBUS_BLOCK_MAX defines already the maximum number as defined in the
SMBus 2.0 specs. I don't see a reason to add 1 here. Also, fix the errno
to what is suggested for this error.
Fixes: c9bfdc7c16 ("i2c: mlxcpld: Add support for smbus block read transaction")
Signed-off-by: Wolfram Sang <wsa+renesas@sang-engineering.com>
Reviewed-by: Michael Shych <michaelsh@mellanox.com>
Tested-by: Michael Shych <michaelsh@mellanox.com>
Signed-off-by: Wolfram Sang <wsa@kernel.org>
Pull sysctl fix from Al Viro:
"Another regression fix for sysctl changes this cycle..."
* 'fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs:
Call sysctl_head_finish on error
I can't recall why there was none, but we surely want to have it.
Signed-off-by: Wolfram Sang <wsa+renesas@sang-engineering.com>
Reviewed-by: Niklas Söderlund <niklas.soderlund+renesas@ragnatech.se>
Reviewed-by: Geert Uytterhoeven <geert+renesas@glider.be>
Signed-off-by: Wolfram Sang <wsa@kernel.org>
Add more details which have either been missing ever since or describe
recent additions.
Signed-off-by: Wolfram Sang <wsa+renesas@sang-engineering.com>
Reviewed-by: Niklas Söderlund <niklas.soderlund+renesas@ragnatech.se>
Reviewed-by: Luca Ceresoli <luca@lucaceresoli.net>
Signed-off-by: Wolfram Sang <wsa@kernel.org>
The driver can't be loaded automatically because it misses
module alias to be provided. Add corresponding MODULE_DEVICE_TABLE()
call to the driver.
Signed-off-by: Andy Shevchenko <andriy.shevchenko@linux.intel.com>
Signed-off-by: Wolfram Sang <wsa@kernel.org>
Current AMD's zen-based APUs use this core for some of its i2c-buses.
With this patch we re-enable autodetection of hwmon-alike devices, so
lm-sensors will be able to work automatically.
It does not affect the boot-time of embedded devices, as the class is
set based on the DMI information.
DMI is probed only on Qtechnology QT5222 Industrial Camera Platform.
DocLink: https://qtec.com/camera-technology-camera-platforms/
Fixes: 3eddad96c4 ("i2c: designware: reverts "i2c: designware: Add support for AMD I2C controller"")
Signed-off-by: Ricardo Ribalda <ribalda@kernel.org>
Reviewed-by: Andy Shevchenko <andriy.shevchenko@linux.intel.com>
Acked-by: Jarkko Nikula <jarkko.nikula@linux.intel.com>
Signed-off-by: Wolfram Sang <wsa@kernel.org>
The PCA9665 datasheet says that I2CSTA = 78h indicates that SCL is stuck
low, this differs to the PCA9564 which uses 90h for this indication.
Treat either 0x78 or 0x90 as an indication that the SCL line is stuck.
Based on looking through the PCA9564 and PCA9665 datasheets this should
be safe for both chips. The PCA9564 should not return 0x78 for any valid
state and the PCA9665 should not return 0x90.
Fixes: eff9ec95ef ("i2c-algo-pca: Add PCA9665 support")
Signed-off-by: Chris Packham <chris.packham@alliedtelesis.co.nz>
Reviewed-by: Andy Shevchenko <andriy.shevchenko@linux.intel.com>
Signed-off-by: Wolfram Sang <wsa@kernel.org>
-----BEGIN PGP SIGNATURE-----
iQGzBAABCgAdFiEE6fsu8pdIjtWE/DpLiiy9cAdyT1EFAl7/s1YACgkQiiy9cAdy
T1GhzgwAqARAg1iCUEDjyy7VEZ9HNA3X87GxM7zkid5Fz2WTDlHLBQL6LWZkLODK
PIz8IP4V3DoBddN2DGlqIiCZmCMDn2bBN+6u1O2TkR2lv2w3ASxzwYSMQWqUUw6U
a03BkDZNFE4fJq5pPDdVaVzDss4tuNKW8N5RvptRqlbLp74SRUgMjVyyWwN4UunW
AHH3VqRCWJJj6Yp6MAx3rtoEiAtjTt9Ej3Fb2MXdF5jZObzI3LOY13Z09QIWbE3P
Sh7Py66CSG7UYYkQqoe43zYwxeOgo6FAYWxIULTPJYdFIi5+RHPQ0SYc6+BHfDRo
AHMchJpwZ/j4JOeIJGDItuUQPVnwYAOZ+75s7ofhAbG95kwcfs+AkDoLqkM8IWpu
LS5rHi7sOA4GK8Hio9xp+MgttsmXRcnBQ4ShBoTaDBKa7v/NeRAPolsD5FgZWunO
CKRDsDD5hKO2bQsJk4te35/IQpxRpEiiGmMpyaNUdaCdhXxcPHCYEWYdw9EnTP6i
1xc7au/u
=laR3
-----END PGP SIGNATURE-----
Merge tag '5.8-rc3-smb3-fixes' of git://git.samba.org/sfrench/cifs-2.6
Pull cifs fixes from Steve French:
"Eight cifs/smb3 fixes, most when specifying the multiuser mount flag.
Five of the fixes are for stable"
* tag '5.8-rc3-smb3-fixes' of git://git.samba.org/sfrench/cifs-2.6:
cifs: prevent truncation from long to int in wait_for_free_credits
cifs: Fix the target file was deleted when rename failed.
SMB3: Honor 'posix' flag for multiuser mounts
SMB3: Honor 'handletimeout' flag for multiuser mounts
SMB3: Honor lease disabling for multiuser mounts
SMB3: Honor persistent/resilient handle flags for multiuser mounts
SMB3: Honor 'seal' flag for multiuser mounts
cifs: Display local UID details for SMB sessions in DebugData
Accumulated hwmon patches:
- Fix typo in Kconfig SENSORS_IR35221 option
- Fix potential memory leak in acpi_power_meter_add()
- Make sure the OVERT mask is set correctly in max6697 driver
- In PMBus core, fix page vs. register when accessing fans
- Mark is_visible functions static in bt1-pvt driver
- Define Temp- and Volt-to-N poly as maybe-unused in bt1-pvt driver
-----BEGIN PGP SIGNATURE-----
iQIzBAABCAAdFiEEiHPvMQj9QTOCiqgVyx8mb86fmYEFAl7/quEACgkQyx8mb86f
mYGSeg/7BaJI66dvQ+r+nRqlyEwJxK7j2a7EvOPwNUNcbBYp5uooXpK+xXcSShYn
bwPJVREWz6jY5F+pi7vehpXPhRLwdNRZulElYi6GEgd6BvzVQwVcQHAhajqAjvDN
2ILd+22IAmrWdCfHT0dNvt4+p0HOHsHKva7Jo9HsnLgQVDENg/0iUjKhGSCc3J5Q
542cdLlO0ExZSW/K1cWL5IazQiVSMeH+VE+53NKOMurj2vMdK0wIoy0NDMSj1sMO
8/SzvibKZuFUqKgzu3p6IsUSWH2a+v9GsrxvymcfQ7TnsYH7HsdpjA/ggZ1PL9vQ
bhUkVPRsZwLYbo9YAyH2d64JLLOl2GIRpdfr8yn1wQLsbwpByu1i/uXtdZM0v5Qo
A8gaqP+ptxVAxtxErDiaw8S4grHIKdsJQuRoh3rwszmPcJ2kKB4V/jPVtrD7v/ag
eBn9yGx61WSr81I4jY7l+Keg6RTkOXD5KXGinzazWUb6vc/aaN0vLz/Rn4f4Ha8q
HIFrCciMDaSaecXkT7K6ZVt3Oha4++xWo1ztTJGUC9Lhj5d9AQn0hrO2FiPI6UAs
qwVNiEwtNDcbVaZTbAsgepEYS9d3ABcz/Nu9tSjCUxqOKt95+/ngh+bBX45fd68q
fEajLN6Ehy5Lzfrkq30ViUCjE+gxim6zWZpubV91iZCqO7g5EL0=
=3guP
-----END PGP SIGNATURE-----
Merge tag 'hwmon-for-v5.8-rc4' of git://git.kernel.org/pub/scm/linux/kernel/git/groeck/linux-staging
Pull hwmon fixes from Guenter Roeck:
- Fix typo in Kconfig SENSORS_IR35221 option
- Fix potential memory leak in acpi_power_meter_add()
- Make sure the OVERT mask is set correctly in max6697 driver
- In PMBus core, fix page vs. register when accessing fans
- Mark is_visible functions static in bt1-pvt driver
- Define Temp- and Volt-to-N poly as maybe-unused in bt1-pvt driver
* tag 'hwmon-for-v5.8-rc4' of git://git.kernel.org/pub/scm/linux/kernel/git/groeck/linux-staging:
hwmon: (pmbus) fix a typo in Kconfig SENSORS_IR35221 option
hwmon: (acpi_power_meter) Fix potential memory leak in acpi_power_meter_add()
hwmon: (max6697) Make sure the OVERT mask is set correctly
hwmon: (pmbus) Fix page vs. register when accessing fans
hwmon: (bt1-pvt) Mark is_visible functions static
hwmon: (bt1-pvt) Define Temp- and Volt-to-N poly as maybe-unused
Merge misc fixes from Andrew Morton:
"Subsystems affected by this patch series: mm/hugetlb, samples, mm/cma,
mm/vmalloc, mm/pagealloc"
* emailed patches from Andrew Morton <akpm@linux-foundation.org>:
mm/page_alloc: fix documentation error
vmalloc: fix the owner argument for the new __vmalloc_node_range callers
mm/cma.c: use exact_nid true to fix possible per-numa cma leak
samples/vfs: avoid warning in statx override
mm/hugetlb.c: fix pages per hugetlb calculation
When I increased the upper bound of the min_free_kbytes value in
ee8eb9a5fe ("mm/page_alloc: increase default min_free_kbytes bound") I
forgot to tweak the above comment to reflect the new value. This patch
fixes that mistake.
Signed-off-by: Joel Savitz <jsavitz@redhat.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Cc: Matthew Wilcox <willy@infradead.org>
Cc: Rafael Aquini <aquini@redhat.com>
Cc: Fabrizio D'Angelo <fdangelo@redhat.com>
Link: http://lkml.kernel.org/r/20200624221236.29560-1-jsavitz@redhat.com
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Fix the recently added new __vmalloc_node_range callers to pass the
correct values as the owner for display in /proc/vmallocinfo.
Fixes: 800e26b813 ("x86/hyperv: allocate the hypercall page with only read and execute bits")
Fixes: 10d5e97c1b ("arm64: use PAGE_KERNEL_ROX directly in alloc_insn_page")
Fixes: 7a0e27b2a0 ("mm: remove vmalloc_exec")
Reported-by: Ard Biesheuvel <ardb@kernel.org>
Signed-off-by: Christoph Hellwig <hch@lst.de>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Link: http://lkml.kernel.org/r/20200627075649.2455097-1-hch@lst.de
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Calling cma_declare_contiguous_nid() with false exact_nid for per-numa
reservation can easily cause cma leak and various confusion. For example,
mm/hugetlb.c is trying to reserve per-numa cma for gigantic pages. But it
can easily leak cma and make users confused when system has memoryless
nodes.
In case the system has 4 numa nodes, and only numa node0 has memory. if
we set hugetlb_cma=4G in bootargs, mm/hugetlb.c will get 4 cma areas for 4
different numa nodes. since exact_nid=false in current code, all 4 numa
nodes will get cma successfully from node0, but hugetlb_cma[1 to 3] will
never be available to hugepage will only allocate memory from
hugetlb_cma[0].
In case the system has 4 numa nodes, both numa node0&2 has memory, other
nodes have no memory. if we set hugetlb_cma=4G in bootargs, mm/hugetlb.c
will get 4 cma areas for 4 different numa nodes. since exact_nid=false in
current code, all 4 numa nodes will get cma successfully from node0 or 2,
but hugetlb_cma[1] and [3] will never be available to hugepage as
mm/hugetlb.c will only allocate memory from hugetlb_cma[0] and
hugetlb_cma[2]. This causes permanent leak of the cma areas which are
supposed to be used by memoryless node.
Of cource we can workaround the issue by letting mm/hugetlb.c scan all cma
areas in alloc_gigantic_page() even node_mask includes node0 only. that
means when node_mask includes node0 only, we can get page from
hugetlb_cma[1] to hugetlb_cma[3]. But this will cause kernel crash in
free_gigantic_page() while it wants to free page by:
cma_release(hugetlb_cma[page_to_nid(page)], page, 1 << order)
On the other hand, exact_nid=false won't consider numa distance, it might
be not that useful to leverage cma areas on remote nodes. I feel it is
much simpler to make exact_nid true to make everything clear. After that,
memoryless nodes won't be able to reserve per-numa CMA from other nodes
which have memory.
Fixes: cf11e85fc0 ("mm: hugetlb: optionally allocate gigantic hugepages using cma")
Signed-off-by: Barry Song <song.bao.hua@hisilicon.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Acked-by: Roman Gushchin <guro@fb.com>
Cc: Jonathan Cameron <Jonathan.Cameron@huawei.com>
Cc: Aslan Bakirov <aslan@fb.com>
Cc: Michal Hocko <mhocko@kernel.org>
Cc: Andreas Schaufler <andreas.schaufler@gmx.de>
Cc: Mike Kravetz <mike.kravetz@oracle.com>
Cc: Rik van Riel <riel@surriel.com>
Cc: Joonsoo Kim <js1304@gmail.com>
Cc: Robin Murphy <robin.murphy@arm.com>
Cc: <stable@vger.kernel.org>
Link: http://lkml.kernel.org/r/20200628074345.27228-1-song.bao.hua@hisilicon.com
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Something changed recently to uncover this warning:
samples/vfs/test-statx.c:24:15: warning: `struct foo' declared inside parameter list will not be visible outside of this definition or declaration
24 | #define statx foo
| ^~~
Which is due the use of "struct statx" (here, "struct foo") in a function
prototype argument list before it has been defined:
int
# 56 "/usr/include/x86_64-linux-gnu/bits/statx-generic.h"
foo
# 56 "/usr/include/x86_64-linux-gnu/bits/statx-generic.h" 3 4
(int __dirfd, const char *__restrict __path, int __flags,
unsigned int __mask, struct
# 57 "/usr/include/x86_64-linux-gnu/bits/statx-generic.h"
foo
# 57 "/usr/include/x86_64-linux-gnu/bits/statx-generic.h" 3 4
*__restrict __buf)
__attribute__ ((__nothrow__ , __leaf__)) __attribute__ ((__nonnull__ (2, 5)));
Add explicit struct before #include to avoid warning.
Fixes: f1b5618e01 ("vfs: Add a sample program for the new mount API")
Signed-off-by: Kees Cook <keescook@chromium.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Cc: Miklos Szeredi <mszeredi@redhat.com>
Cc: Al Viro <viro@zeniv.linux.org.uk>
Cc: David Howells <dhowells@redhat.com>
Link: http://lkml.kernel.org/r/202006282213.C516EA6@keescook
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
The routine hpage_nr_pages() was incorrectly used to calculate the number
of base pages in a hugetlb page. hpage_nr_pages is designed to be called
for THP pages and will return HPAGE_PMD_NR for hugetlb pages of any size.
Due to the context in which hpage_nr_pages was called, it is unlikely to
produce a user visible error. The routine with the incorrect call is only
exercised in the case of hugetlb memory error or migration. In addition,
this would need to be on an architecture which supports huge page sizes
less than PMD_SIZE. And, the vma containing the huge page would also need
to smaller than PMD_SIZE.
Fixes: c0d0381ade ("hugetlbfs: use i_mmap_rwsem for more pmd sharing synchronization")
Reported-by: Matthew Wilcox (Oracle) <willy@infradead.org>
Signed-off-by: Mike Kravetz <mike.kravetz@oracle.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Reviewed-by: Matthew Wilcox (Oracle) <willy@infradead.org>
Cc: Michal Hocko <mhocko@kernel.org>
Cc: "Kirill A . Shutemov" <kirill.shutemov@linux.intel.com>
Cc: <stable@vger.kernel.org>
Link: http://lkml.kernel.org/r/20200629185003.97202-1-mike.kravetz@oracle.com
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
- Fix a use-after-free bug when the fs shuts down.
-----BEGIN PGP SIGNATURE-----
iQIzBAABCgAdFiEEUzaAxoMeQq6m2jMV+H93GTRKtOsFAl77c9oACgkQ+H93GTRK
tOvYow/+O61y32kJzJ6xJ5QoGE5K10CXp4cpBwOgG/6POERIDBirgAMqKAx8rEu2
go36qUVzwyveaB4iyuIkw5K+odZbHpGiuQqiGu8Aw4XBEAEhDqPPsnHhllSi/VOq
4AjvfYefmj0ALQay1pzGpR2h5+03JwOw0ZFcmBl5QSTMLwZQZ1PoU8ujiYPSxsUr
m9dcGZtGU16mmDgORzYTDnSYSKruhJDSD5IxsID+QST0wK4MuPkJr0T+ZQmziWb1
xmHE1aTGDZcrYG4+x2Pzop822mrnuMBnMaX8KOOtiZAhtKb19sAf0OUWBkWOQYvb
Vk3mIDDz910vd/szw3B5KMVNkeYoRtHAQztpLyfJ3Gtxmt5g/4ZX2fEX/KWJbV5D
82GLf4gB4GhTQFUwcTmgtmCCd87aH0ABiCEnURN6R04tOCXgc7bYn0XXvZL6Axd/
25bkhlDdmOfveAZrZ1WKWSEYN/at9R5iqYsbWH1FmoE6h82OvZDxweB7P4r66KwT
pMhYRqKrRrwNtarPmn6bC/8Ci5h6vl8MOP3+mTYx2eXU0aFe8OCYpPNFsSufxx+e
hHTHUKxCSQDfYHbIqXVF5F8G7msh4ue+687UIo7seWyzTU4PaphH0kaHKV/+fiY1
zU9+GNjk0iVZy6cG+25ZcMPQmYv7qdrvm6XuxciywFim7l2wq0s=
=BnJO
-----END PGP SIGNATURE-----
Merge tag 'xfs-5.8-fixes-1' of git://git.kernel.org/pub/scm/fs/xfs/xfs-linux
Pull xfs fix from Darrick Wong:
"Fix a use-after-free bug when the fs shuts down"
* tag 'xfs-5.8-fixes-1' of git://git.kernel.org/pub/scm/fs/xfs/xfs-linux:
xfs: fix use-after-free on CIL context on shutdown
-----BEGIN PGP SIGNATURE-----
iQJIBAABCgAyFiEEgMe7l+5h9hnxdsnuWYigwDrT+vwFAl7/VbwUHGJoZWxnYWFz
QGdvb2dsZS5jb20ACgkQWYigwDrT+vycww//ZRzj5OKP2H+J5LouaVYsltQtaqFb
vNWIVjZnB2MbbNprDMtvulBI7+u2+j9TynlRQIAJKKIazKjtTjqApLDYlsIKXb5c
fjmqGKIL72uFhIedP8eHif/WGgFTQqyD50M95Uiu9ik0AhgpLUkfordss4+2+fdk
bDaXEXpaCkDjMcQ164lFxYrpBGih9YYvRShqfZmmcyfDJrxMfKXM0Iprx7MM8uTA
0BiUaayr1MRkWKGm+zT8MTeMZkalrfjMukyQnCeca19KFNe14m2GG/5mp94G8PZQ
OflVjWkwoC5VCmn7bvY1rkUbqAmjIJURQVtnMo1D2UI01TPkHzIkH7ZBGOfouGib
CCQuzw6UhPT8T41Lucd/+bxKj041wWYormgEq4oZX0hDPim0vxEg/hRONeZ2fDpO
ZzFevKFYX643Gjb+c/dKsGfbOWWjE4FcD5slKVp0c4wAcNhq/p3wxJuDxkh9Gn2J
YzIjLLsD3+sAndYn2PgfSLUpAQn3c3x02RbSzpEeqcFSo+Gq/4RCeX3kjAB9tzCL
Trl+6lsSrAYNaiXgFWfmYXtdtUOnQljYOuQgrTtvu5ugaUxTy2oAhecLqsy68cf0
OBrzALIsN1xwbgGxak8dOzzcmHENmL246jI1GFU1VXpTvbvtO7r1prv8yVo04b8+
sBJDXRISQNYCaqY=
=ydSI
-----END PGP SIGNATURE-----
Merge tag 'pci-v5.8-fixes-1' of git://git.kernel.org/pub/scm/linux/kernel/git/helgaas/pci
Pull PCI fix from Bjorn Helgaas:
"Fix a pcie_find_root_port() simplification that broke power management
because it didn't handle the edge case of finding the Root Port of a
Root Port itself (Mika Westerberg)""
* tag 'pci-v5.8-fixes-1' of git://git.kernel.org/pub/scm/linux/kernel/git/helgaas/pci:
PCI: Make pcie_find_root_port() work for Root Ports