Commit Graph

19384 Commits

Author SHA1 Message Date
Linus Torvalds
71998d1be4 Merge branch 'x86-irq-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip
Pull x86 irq fixes from Ingo Molnar:
 "Two changes: a cpu-hotplug/irq race fix, plus a HyperV related fix"

* 'x86-irq-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
  x86/irq: Fix fixup_irqs() error handling
  x86, irq, pic: Probe for legacy PIC and set legacy_pic appropriately
2014-06-12 20:03:47 -07:00
Linus Torvalds
3737a12761 Merge branch 'perf-core-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip
Pull more perf updates from Ingo Molnar:
 "A second round of perf updates:

   - wide reaching kprobes sanitization and robustization, with the hope
     of fixing all 'probe this function crashes the kernel' bugs, by
     Masami Hiramatsu.

   - uprobes updates from Oleg Nesterov: tmpfs support, corner case
     fixes and robustization work.

   - perf tooling updates and fixes from Jiri Olsa, Namhyung Ki, Arnaldo
     et al:
        * Add support to accumulate hist periods (Namhyung Kim)
        * various fixes, refactorings and enhancements"

* 'perf-core-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: (101 commits)
  perf: Differentiate exec() and non-exec() comm events
  perf: Fix perf_event_comm() vs. exec() assumption
  uprobes/x86: Rename arch_uprobe->def to ->defparam, minor comment updates
  perf/documentation: Add description for conditional branch filter
  perf/x86: Add conditional branch filtering support
  perf/tool: Add conditional branch filter 'cond' to perf record
  perf: Add new conditional branch filter 'PERF_SAMPLE_BRANCH_COND'
  uprobes: Teach copy_insn() to support tmpfs
  uprobes: Shift ->readpage check from __copy_insn() to uprobe_register()
  perf/x86: Use common PMU interrupt disabled code
  perf/ARM: Use common PMU interrupt disabled code
  perf: Disable sampled events if no PMU interrupt
  perf: Fix use after free in perf_remove_from_context()
  perf tools: Fix 'make help' message error
  perf record: Fix poll return value propagation
  perf tools: Move elide bool into perf_hpp_fmt struct
  perf tools: Remove elide setup for SORT_MODE__MEMORY mode
  perf tools: Fix "==" into "=" in ui_browser__warning assignment
  perf tools: Allow overriding sysfs and proc finding with env var
  perf tools: Consider header files outside perf directory in tags target
  ...
2014-06-12 19:18:49 -07:00
Linus Torvalds
c29deef32e Merge branch 'locking-core-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip
Pull more locking changes from Ingo Molnar:
 "This is the second round of locking tree updates for v3.16, offering
  large system scalability improvements:

 - optimistic spinning for rwsems, from Davidlohr Bueso.

 - 'qrwlocks' core code and x86 enablement, from Waiman Long and PeterZ"

* 'locking-core-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
  x86, locking/rwlocks: Enable qrwlocks on x86
  locking/rwlocks: Introduce 'qrwlocks' - fair, queued rwlocks
  locking/mutexes: Documentation update/rewrite
  locking/rwsem: Fix checkpatch.pl warnings
  locking/rwsem: Fix warnings for CONFIG_RWSEM_GENERIC_SPINLOCK
  locking/rwsem: Support optimistic spinning
2014-06-12 18:48:15 -07:00
Linus Torvalds
f9da455b93 Merge git://git.kernel.org/pub/scm/linux/kernel/git/davem/net-next
Pull networking updates from David Miller:

 1) Seccomp BPF filters can now be JIT'd, from Alexei Starovoitov.

 2) Multiqueue support in xen-netback and xen-netfront, from Andrew J
    Benniston.

 3) Allow tweaking of aggregation settings in cdc_ncm driver, from Bjørn
    Mork.

 4) BPF now has a "random" opcode, from Chema Gonzalez.

 5) Add more BPF documentation and improve test framework, from Daniel
    Borkmann.

 6) Support TCP fastopen over ipv6, from Daniel Lee.

 7) Add software TSO helper functions and use them to support software
    TSO in mvneta and mv643xx_eth drivers.  From Ezequiel Garcia.

 8) Support software TSO in fec driver too, from Nimrod Andy.

 9) Add Broadcom SYSTEMPORT driver, from Florian Fainelli.

10) Handle broadcasts more gracefully over macvlan when there are large
    numbers of interfaces configured, from Herbert Xu.

11) Allow more control over fwmark used for non-socket based responses,
    from Lorenzo Colitti.

12) Do TCP congestion window limiting based upon measurements, from Neal
    Cardwell.

13) Support busy polling in SCTP, from Neal Horman.

14) Allow RSS key to be configured via ethtool, from Venkata Duvvuru.

15) Bridge promisc mode handling improvements from Vlad Yasevich.

16) Don't use inetpeer entries to implement ID generation any more, it
    performs poorly, from Eric Dumazet.

* git://git.kernel.org/pub/scm/linux/kernel/git/davem/net-next: (1522 commits)
  rtnetlink: fix userspace API breakage for iproute2 < v3.9.0
  tcp: fixing TLP's FIN recovery
  net: fec: Add software TSO support
  net: fec: Add Scatter/gather support
  net: fec: Increase buffer descriptor entry number
  net: fec: Factorize feature setting
  net: fec: Enable IP header hardware checksum
  net: fec: Factorize the .xmit transmit function
  bridge: fix compile error when compiling without IPv6 support
  bridge: fix smatch warning / potential null pointer dereference
  via-rhine: fix full-duplex with autoneg disable
  bnx2x: Enlarge the dorq threshold for VFs
  bnx2x: Check for UNDI in uncommon branch
  bnx2x: Fix 1G-baseT link
  bnx2x: Fix link for KR with swapped polarity lane
  sctp: Fix sk_ack_backlog wrap-around problem
  net/core: Add VF link state control policy
  net/fsl: xgmac_mdio is dependent on OF_MDIO
  net/fsl: Make xgmac_mdio read error message useful
  net_sched: drr: warn when qdisc is not work conserving
  ...
2014-06-12 14:27:40 -07:00
Linus Torvalds
682b7c1c8e Merge branch 'drm-next' of git://people.freedesktop.org/~airlied/linux
Pull drm updates from Dave Airlie:
 "This is the main drm merge window pull request, changes all over the
  place, mostly normal levels of churn.

  Highlights:

  Core drm:
     More cleanups, fix race on connector/encoder naming, docs updates,
     object locking rework in prep for atomic modeset

  i915:
     mipi DSI support, valleyview power fixes, cursor size fixes,
     execlist refactoring, vblank improvements, userptr support, OOM
     handling improvements

  radeon:
     GPUVM tuning and large page size support, gart fixes, deep color
     HDMI support, HDMI audio cleanups

  nouveau:
     - displayport rework should fix lots of issues
     - initial gk20a support
     - gk110b support
     - gk208 fixes

  exynos:
     probe order fixes, HDMI changes, IPP consolidation

  msm:
     debugfs updates, misc fixes

  ast:
     ast2400 support, sync with UMS driver

  tegra:
     cleanups, hdmi + hw cursor for Tegra 124.

  panel:
     fixes existing panels add some new ones.

  ipuv3:
     moved from staging to drivers/gpu"

* 'drm-next' of git://people.freedesktop.org/~airlied/linux: (761 commits)
  drm/nouveau/disp/dp: fix tmds passthrough on dp connector
  drm/nouveau/dp: probe dpcd to determine connectedness
  drm/nv50-: trigger update after all connectors disabled
  drm/nv50-: prepare for attaching a SOR to multiple heads
  drm/gf119-/disp: fix debug output on update failure
  drm/nouveau/disp/dp: make use of postcursor when its available
  drm/g94-/disp/dp: take max pullup value across all lanes
  drm/nouveau/bios/dp: parse lane postcursor data
  drm/nouveau/dp: fix support for dpms
  drm/nouveau: register a drm_dp_aux channel for each dp connector
  drm/g94-/disp: add method to power-off dp lanes
  drm/nouveau/disp/dp: maintain link in response to hpd signal
  drm/g94-/disp: bash and wait for something after changing lane power regs
  drm/nouveau/disp/dp: split link config/power into two steps
  drm/nv50/disp: train PIOR-attached DP from second supervisor
  drm/nouveau/disp/dp: make use of existing output data for link training
  drm/gf119/disp: start removing direct vbios parsing from supervisor
  drm/nv50/disp: start removing direct vbios parsing from supervisor
  drm/nouveau/disp/dp: maintain receiver caps in response to hpd signal
  drm/nouveau/disp/dp: create subclass for dp outputs
  ...
2014-06-12 11:32:30 -07:00
Alexei Starovoitov
e430f34ee5 net: filter: cleanup A/X name usage
The macro 'A' used in internal BPF interpreter:
 #define A regs[insn->a_reg]
was easily confused with the name of classic BPF register 'A', since
'A' would mean two different things depending on context.

This patch is trying to clean up the naming and clarify its usage in the
following way:

- A and X are names of two classic BPF registers

- BPF_REG_A denotes internal BPF register R0 used to map classic register A
  in internal BPF programs generated from classic

- BPF_REG_X denotes internal BPF register R7 used to map classic register X
  in internal BPF programs generated from classic

- internal BPF instruction format:
struct sock_filter_int {
        __u8    code;           /* opcode */
        __u8    dst_reg:4;      /* dest register */
        __u8    src_reg:4;      /* source register */
        __s16   off;            /* signed offset */
        __s32   imm;            /* signed immediate constant */
};

- BPF_X/BPF_K is 1 bit used to encode source operand of instruction
In classic:
  BPF_X - means use register X as source operand
  BPF_K - means use 32-bit immediate as source operand
In internal:
  BPF_X - means use 'src_reg' register as source operand
  BPF_K - means use 32-bit immediate as source operand

Suggested-by: Chema Gonzalez <chema@google.com>
Signed-off-by: Alexei Starovoitov <ast@plumgrid.com>
Acked-by: Daniel Borkmann <dborkman@redhat.com>
Acked-by: Chema Gonzalez <chema@google.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2014-06-11 00:13:16 -07:00
Linus Torvalds
dfb945473a Merge git://www.linux-watchdog.org/linux-watchdog
Pull watchdog updates from Wim Van Sebroeck:
 "This contains:
   - addition of the Intel MID watchdog
   - removal of W83697HF and W83697UG drivers (code was merged into
     w83627hf_wdt driver)
   - addition of Armada 375/380 SoC support
   - conversion of imx2_wdt to regmap API and to watchdog core API
   - lots of other small improvements and fixes"

[ Wim was also tagged by gmail as a spammer, but not delayed by days
  unlike Ben ]

* git://www.linux-watchdog.org/linux-watchdog: (25 commits)
  x86: intel-mid: add watchdog platform code for Merrifield
  watchdog: add Intel MID watchdog driver support
  watchdog: sp805: Set watchdog_device->timeout from ->set_timeout()
  booke/watchdog: refine and clean up the codes
  watchdog: iop_wdt only builds for mach-iop13xx
  watchdog: Remove drivers for W83697HF and W83697UG
  watchdog: w83627hf_wdt: Add early_disable module parameter
  ARM: mvebu: Add A375/A380 watchdog binding documentation
  watchdog: orion: Add Armada 375/380 SoC support
  watchdog: orion: Introduce per-SoC enabled() function
  watchdog: orion: Introduce per-SoC stop() function
  watchdog: orion: Remove unneeded atomic access
  watchdog: orion: Introduce a SoC-specific RSTOUT mapping
  watchdog: orion: Move the register ioremap'ing to its own function
  watchdog: xilinx: Make of_device_id array const
  watchdog: imx2_wdt: convert to watchdog core api
  watchdog: imx2_wdt: convert to use regmap API.
  watchdog: imx2_wdt: Sort the header files alphabetically
  watchdog: ath79_wdt: switch to clk_prepare/clk_disable
  watchdog: ath79_wdt: avoid spurious restarts on AR934x
  ...
2014-06-10 19:16:36 -07:00
David Cohen
78a3bb9e40 x86: intel-mid: add watchdog platform code for Merrifield
This patch adds platform code for Intel Merrifield.
Since the watchdog is not part of SFI table, we have no other option but
to manually register watchdog's platform device (argh!).

Signed-off-by: David Cohen <david.a.cohen@linux.intel.com>
Reviewed-by: Guenter Roeck <linux@roeck-us.net>
Signed-off-by: Wim Van Sebroeck <wim@iguana.be>
2014-06-10 21:48:20 +02:00
Linus Torvalds
214b931320 Lots of tweaks, small fixes, optimizations, and some helper functions
to help out the rest of the kernel to ease their use of trace events.
 
 The big change for this release is the allowing of other tracers,
 such as the latency tracers, to be used in the trace instances and allow
 for function or function graph tracing to be in the top level
 simultaneously.
 -----BEGIN PGP SIGNATURE-----
 Version: GnuPG v1
 
 iQEcBAABAgAGBQJTlbUqAAoJEKQekfcNnQGuP+8H+wTBG06beHsqe6XcaeXcKNkt
 Mimm0O04oQdw89CBWeJvXyOwRTtiN4M/4hxHXBTDtChxM9oUyWw1o0IpSMMuQ16O
 w9r3DfC8e1air+ufEuYWM0QNtyzHi8EfDSNia55ON5jvtkCZTXOEKZD+n8M9w28p
 I7PVgr0PDztsCpethCpg0M8beK9zuQPWMzsHAQCsKI06Xl5z33kPIJR15Exh+Kr1
 uVVTZW7JFVAPuSnteLSIx9pN6OjsVGzOZCljg+O+9/v/02u5nkMiS2nURxae86kg
 RTSiRYT6Hvl/MCBhdss/w5kgSk6BYiZ0hXbLtwetvre+vQrOR5CnDw2DxZ7e+gU=
 =oudH
 -----END PGP SIGNATURE-----

Merge tag 'trace-3.16' of git://git.kernel.org/pub/scm/linux/kernel/git/rostedt/linux-trace

Pull tracing updates from Steven Rostedt:
 "Lots of tweaks, small fixes, optimizations, and some helper functions
  to help out the rest of the kernel to ease their use of trace events.

  The big change for this release is the allowing of other tracers, such
  as the latency tracers, to be used in the trace instances and allow
  for function or function graph tracing to be in the top level
  simultaneously"

* tag 'trace-3.16' of git://git.kernel.org/pub/scm/linux/kernel/git/rostedt/linux-trace: (44 commits)
  tracing: Fix memory leak on instance deletion
  tracing: Fix leak of ring buffer data when new instances creation fails
  tracing/kprobes: Avoid self tests if tracing is disabled on boot up
  tracing: Return error if ftrace_trace_arrays list is empty
  tracing: Only calculate stats of tracepoint benchmarks for 2^32 times
  tracing: Convert stddev into u64 in tracepoint benchmark
  tracing: Introduce saved_cmdlines_size file
  tracing: Add __get_dynamic_array_len() macro for trace events
  tracing: Remove unused variable in trace_benchmark
  tracing: Eliminate double free on failure of allocation on boot up
  ftrace/x86: Call text_ip_addr() instead of the duplicated code
  tracing: Print max callstack on stacktrace bug
  tracing: Move locking of trace_cmdline_lock into start/stop seq calls
  tracing: Try again for saved cmdline if failed due to locking
  tracing: Have saved_cmdlines use the seq_read infrastructure
  tracing: Add tracepoint benchmark tracepoint
  tracing: Print nasty banner when trace_printk() is in use
  tracing: Add funcgraph_tail option to print function name after closing braces
  tracing: Eliminate duplicate TRACE_GRAPH_PRINT_xx defines
  tracing: Add __bitmask() macro to trace events to cpumasks and other bitmasks
  ...
2014-06-09 16:39:15 -07:00
Linus Torvalds
ed81e780a7 Merge branch 'x86-vdso-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip
Pull x86 vdso build fix from Peter Anvin:
 "This fixes building the vdso code on older Linux systems, and probably
  some non-Linux systems"

* 'x86-vdso-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
  x86, vdso: Use <tools/le_byteshift.h> for littleendian access
2014-06-08 12:26:43 -07:00
Linus Torvalds
3f17ea6dea Merge branch 'next' (accumulated 3.16 merge window patches) into master
Now that 3.15 is released, this merges the 'next' branch into 'master',
bringing us to the normal situation where my 'master' branch is the
merge window.

* accumulated work in next: (6809 commits)
  ufs: sb mutex merge + mutex_destroy
  powerpc: update comments for generic idle conversion
  cris: update comments for generic idle conversion
  idle: remove cpu_idle() forward declarations
  nbd: zero from and len fields in NBD_CMD_DISCONNECT.
  mm: convert some level-less printks to pr_*
  MAINTAINERS: adi-buildroot-devel is moderated
  MAINTAINERS: add linux-api for review of API/ABI changes
  mm/kmemleak-test.c: use pr_fmt for logging
  fs/dlm/debug_fs.c: replace seq_printf by seq_puts
  fs/dlm/lockspace.c: convert simple_str to kstr
  fs/dlm/config.c: convert simple_str to kstr
  mm: mark remap_file_pages() syscall as deprecated
  mm: memcontrol: remove unnecessary memcg argument from soft limit functions
  mm: memcontrol: clean up memcg zoneinfo lookup
  mm/memblock.c: call kmemleak directly from memblock_(alloc|free)
  mm/mempool.c: update the kmemleak stack trace for mempool allocations
  lib/radix-tree.c: update the kmemleak stack trace for radix tree allocations
  mm: introduce kmemleak_update_trace()
  mm/kmemleak.c: use %u to print ->checksum
  ...
2014-06-08 11:31:16 -07:00
Linus Torvalds
bb077d6006 Revert "x86/smpboot: Initialize secondary CPU only if master CPU will wait for it"
This reverts commit 3e1a878b7c.

It came in very late, and already has one reported failure: Sitsofe
reports that the current tree fails to boot on his EeePC, and bisected
it down to this.  Rather than waste time trying to figure out what's
wrong, just revert it.

Reported-by: Sitsofe Wheeler <sitsofe@gmail.com>
Cc: Igor Mammedov <imammedo@redhat.com>
Cc: Toshi Kani <toshi.kani@hp.com>
Cc: Thomas Gleixner <tglx@linutronix.de>
Acked-by: Ingo Molnar <mingo@kernel.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2014-06-08 10:09:49 -07:00
Linus Torvalds
639b4ac691 Merge git://git.kernel.org/pub/scm/linux/kernel/git/herbert/cryptodev-2.6 into next
Pull crypto updates from Herbert Xu:
 "Here is the crypto update for 3.16:

   - Added test vectors for SHA/AES-CCM/DES-CBC/3DES-CBC.
   - Fixed a number of error-path memory leaks in tcrypt.
   - Fixed error-path memory leak in caam.
   - Removed unnecessary global mutex from mxs-dcp.
   - Added ahash walk interface that can actually be asynchronous.
   - Cleaned up caam error reporting.
   - Allow crypto_user get operation to be used by non-root users.
   - Add support for SSS module on Exynos.
   - Misc fixes"

* git://git.kernel.org/pub/scm/linux/kernel/git/herbert/cryptodev-2.6: (60 commits)
  crypto: testmgr - add aead cbc des, des3_ede tests
  crypto: testmgr - Fix DMA-API warning
  crypto: cesa - tfm->__crt_alg->cra_type directly
  crypto: sahara - tfm->__crt_alg->cra_name directly
  crypto: padlock - tfm->__crt_alg->cra_name directly
  crypto: n2 - tfm->__crt_alg->cra_name directly
  crypto: dcp - tfm->__crt_alg->cra_name directly
  crypto: cesa - tfm->__crt_alg->cra_name directly
  crypto: ccp - tfm->__crt_alg->cra_name directly
  crypto: geode - Don't use tfm->__crt_alg->cra_name directly
  crypto: geode - Weed out printk() from probe()
  crypto: geode - Consistently use AES_KEYSIZE_128
  crypto: geode - Kill AES_IV_LENGTH
  crypto: geode - Kill AES_MIN_BLOCK_SIZE
  crypto: mxs-dcp - Remove global mutex
  crypto: hash - Add real ahash walk interface
  hwrng: n2-drv - Introduce the use of the managed version of kzalloc
  crypto: caam - reinitialize keys_fit_inline for decrypt and givencrypt
  crypto: s5p-sss - fix multiplatform build
  hwrng: timeriomem - remove unnecessary OOM messages
  ...
2014-06-07 19:44:40 -07:00
Matt Fleming
745c51673e x86/boot: EFI_MIXED should not prohibit loading above 4G
commit 7d453eee36 ("x86/efi: Wire up CONFIG_EFI_MIXED") introduced a
regression for the functionality to load kernels above 4G. The relevant
(incorrect) reasoning behind this change can be seen in the commit
message,

  "The xloadflags field in the bzImage header is also updated to reflect
  that the kernel supports both entry points by setting both of
  XLF_EFI_HANDOVER_32 and XLF_EFI_HANDOVER_64 when CONFIG_EFI_MIXED=y.
  XLF_CAN_BE_LOADED_ABOVE_4G is disabled so that the kernel text is
  guaranteed to be addressable with 32-bits."

This is obviously bogus since 32-bit EFI loaders will never place the
kernel above the 4G mark. So this restriction is entirely unnecessary.

But things are worse than that - since we want to encourage people to
always compile with CONFIG_EFI_MIXED=y so that their kernels work out of
the box for both 32-bit and 64-bit firmware, commit 7d453eee36
effectively disables XLF_CAN_BE_LOADED_ABOVE_4G completely.

Remove the overzealous and superfluous restriction and restore the
XLF_CAN_BE_LOADED_ABOVE_4G functionality.

Cc: "H. Peter Anvin" <hpa@zytor.com>
Cc: Dave Young <dyoung@redhat.com>
Cc: Vivek Goyal <vgoyal@redhat.com>
Signed-off-by: Matt Fleming <matt.fleming@intel.com>
Link: http://lkml.kernel.org/r/1402140380-15377-1-git-send-email-matt@console-pimps.org
Signed-off-by: H. Peter Anvin <hpa@zytor.com>
2014-06-07 09:31:00 -07:00
Oleg Nesterov
36fac0a214 signals: kill sigfindinword()
It has no users and it doesn't look useful.  I do not know why/when it was
introduced, I can't even find any user in the git history.

Signed-off-by: Oleg Nesterov <oleg@redhat.com>
Acked-by: Geert Uytterhoeven <geert@linux-m68k.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Al Viro <viro@ZenIV.linux.org.uk>
Cc: David Woodhouse <dwmw2@infradead.org>
Cc: Frederic Weisbecker <fweisbec@gmail.com>
Cc: Ingo Molnar <mingo@kernel.org>
Cc: Mathieu Desnoyers <mathieu.desnoyers@efficios.com>
Cc: Richard Weinberger <richard@nod.at>
Cc: Steven Rostedt <rostedt@goodmis.org>
Cc: Tejun Heo <tj@kernel.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2014-06-06 16:08:11 -07:00
H. Peter Anvin
bdfb9bcc25 x86, vdso: Use <tools/le_byteshift.h> for littleendian access
There are no standard functions for littleendian data (unlike
bigendian data.)  Thus, use <tools/le_byteshift.h> to access
littleendian data members.  Those are fairly inefficient, but it
doesn't matter for this purpose (and can be optimized later.)  This
avoids portability problems.

Reported-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: H. Peter Anvin <hpa@linux.intel.com>
Tested-by: Andy Lutomirski <luto@amacapital.net>
Link: http://lkml.kernel.org/r/20140606140017.afb7f91142f66cb3dd13c186@linux-foundation.org
2014-06-06 14:54:54 -07:00
Waiman Long
bd01ec1a13 x86, locking/rwlocks: Enable qrwlocks on x86
Make x86 use the fair rwlock_t.

Implement the custom queue_write_unlock() for best performance.

Signed-off-by: Waiman Long <Waiman.Long@hp.com>
[peterz: near complete rewrite]
Signed-off-by: Peter Zijlstra <peterz@infradead.org>
Cc: Dave Jones <davej@redhat.com>
Cc: Jeremy Fitzhardinge <jeremy@goop.org>
Cc: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Oleg Nesterov <oleg@redhat.com>
Cc: Raghavendra K T <raghavendra.kt@linux.vnet.ibm.com>
Cc: "Paul E.McKenney" <paulmck@linux.vnet.ibm.com>
Cc: linux-kernel@vger.kernel.org
Cc: x86@kernel.org
Link: http://lkml.kernel.org/n/tip-r1xuzmdysvuhl3h86n5fbxi7@git.kernel.org
Signed-off-by: Ingo Molnar <mingo@kernel.org>
2014-06-06 07:58:29 +02:00
Ingo Molnar
ec00010972 Merge branch 'perf/urgent' into perf/core, to resolve conflict and to prepare for new patches
Conflicts:
	arch/x86/kernel/traps.c

Signed-off-by: Ingo Molnar <mingo@kernel.org>
2014-06-06 07:55:06 +02:00
H. Peter Anvin
177875423e * Fix earlyprintk=efi,keep support by switching to an ioremap() mapping
of the framebuffer when early_ioremap() is no longer available and
    dropping __init from functions that may be invoked after
    free_initmem() - Dave Young
 
  * We shouldn't be exporting the EFI runtime map in sysfs if not using
    the new 1:1 EFI mapping code since in that case the mappings are not
    static across a kexec reboot - Dave Young
 -----BEGIN PGP SIGNATURE-----
 Version: GnuPG v1
 
 iQIcBAABAgAGBQJTj0xzAAoJEC84WcCNIz1VgBgP/3rzGcVB6JBT5qAy2SCcKkL1
 zKAvMNiFWpfyOwICE1gbfZzDky9DIVRXpzriLIe27aQzgn/GYBLH9hAIsYO/5W1m
 uw/KRMKbUkvfnUKPxZUMe95OMsXqwhYth7oGJXq1yFa6SsQdCU4jnmMflUQ6IWfy
 sXtQzlV7frFO0W55Gxv4jRB/nxrExtTHm9Fc01H9xvguKO2l0KEh0C9PbyfZLQp0
 FffxGhcVSZXG23fB5eU0yKLV69buDk83O1y4lQ79zETvCTjuzbpGe2agFDxrlJ/u
 wxZROMStvthkHqapheGwwBu4SySoKsuNRQHKxeJ2t3XEVSBuW+c9jiYYMmpDS0t6
 LhIdLdDtBk8ItsTY60r6w86oLvQVTnkwLxwX/XdzUyA/bSzf/EzHENw2iPbJhpjn
 qfq2ZonXD8j/EVPyYXk65wUea4tMmG1QYpi9oPqMp3b4hCgoJr7qosOqvT9v+E3/
 bbGTjr6bgzOaXdctOocmEg0qfhMw/Ol63V9P4wsiN5mUq1nRJnJcKQtqA3Do08IP
 RgZJg8yQAe9FAfZ0RDaRKs5OSPgnC42t/d6pr9JdiFzgR5Ktw7BymfmOu52o7Koq
 EJFf1TlD+Kipdyo/1Q4s9vtZ1ijO5If6FSL9rYJoduWMWCPSZ9by43uCIbEobEQh
 aXD9ZvSyeu5IwD4BoBiO
 =7zSB
 -----END PGP SIGNATURE-----

Merge tag 'efi-urgent' into x86/urgent

 * Fix earlyprintk=efi,keep support by switching to an ioremap() mapping
   of the framebuffer when early_ioremap() is no longer available and
   dropping __init from functions that may be invoked after
   free_initmem() - Dave Young

 * We shouldn't be exporting the EFI runtime map in sysfs if not using
   the new 1:1 EFI mapping code since in that case the mappings are not
   static across a kexec reboot - Dave Young

Signed-off-by: H. Peter Anvin <hpa@linux.intel.com>
2014-06-05 13:09:44 -07:00
Linus Torvalds
046f153343 Merge branch 'x86-efi-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip into next
Pull x86 EFI updates from Peter Anvin:
 "A collection of EFI changes.  The perhaps most important one is to
  fully save and restore the FPU state around each invocation of EFI
  runtime, and to not choke on non-ASCII characters in the boot stub"

* 'x86-efi-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
  efivars: Add compatibility code for compat tasks
  efivars: Refactor sanity checking code into separate function
  efivars: Stop passing a struct argument to efivar_validate()
  efivars: Check size of user object
  efivars: Use local variables instead of a pointer dereference
  x86/efi: Save and restore FPU context around efi_calls (i386)
  x86/efi: Save and restore FPU context around efi_calls (x86_64)
  x86/efi: Implement a __efi_call_virt macro
  x86, fpu: Extend the use of static_cpu_has_safe
  x86/efi: Delete most of the efi_call* macros
  efi: x86: Handle arbitrary Unicode characters
  efi: Add get_dram_base() helper function
  efi: Add shared printk wrapper for consistent prefixing
  efi: create memory map iteration helper
  efi: efi-stub-helper cleanup
2014-06-05 08:16:29 -07:00
Linus Torvalds
a0abcf2e8f Merge branch 'x86/vdso' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip into next
Pull x86 cdso updates from Peter Anvin:
 "Vdso cleanups and improvements largely from Andy Lutomirski.  This
  makes the vdso a lot less ''special''"

* 'x86/vdso' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
  x86/vdso, build: Make LE access macros clearer, host-safe
  x86/vdso, build: Fix cross-compilation from big-endian architectures
  x86/vdso, build: When vdso2c fails, unlink the output
  x86, vdso: Fix an OOPS accessing the HPET mapping w/o an HPET
  x86, mm: Replace arch_vma_name with vm_ops->name for vsyscalls
  x86, mm: Improve _install_special_mapping and fix x86 vdso naming
  mm, fs: Add vm_ops->name as an alternative to arch_vma_name
  x86, vdso: Fix an OOPS accessing the HPET mapping w/o an HPET
  x86, vdso: Remove vestiges of VDSO_PRELINK and some outdated comments
  x86, vdso: Move the vvar and hpet mappings next to the 64-bit vDSO
  x86, vdso: Move the 32-bit vdso special pages after the text
  x86, vdso: Reimplement vdso.so preparation in build-time C
  x86, vdso: Move syscall and sysenter setup into kernel/cpu/common.c
  x86, vdso: Clean up 32-bit vs 64-bit vdso params
  x86, mm: Ensure correct alignment of the fixmap
2014-06-05 08:05:29 -07:00
Linus Torvalds
2071b3e34f Merge branch 'x86/espfix' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip into next
Pull x86-64 espfix changes from Peter Anvin:
 "This is the espfix64 code, which fixes the IRET information leak as
  well as the associated functionality problem.  With this code applied,
  16-bit stack segments finally work as intended even on a 64-bit
  kernel.

  Consequently, this patchset also removes the runtime option that we
  added as an interim measure.

  To help the people working on Linux kernels for very small systems,
  this patchset also makes these compile-time configurable features"

* 'x86/espfix' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
  Revert "x86-64, modify_ldt: Make support for 16-bit segments a runtime option"
  x86, espfix: Make it possible to disable 16-bit support
  x86, espfix: Make espfix64 a Kconfig option, fix UML
  x86, espfix: Fix broken header guard
  x86, espfix: Move espfix definitions into a separate header file
  x86-32, espfix: Remove filter for espfix32 due to race
  x86-64, espfix: Don't leak bits 31:16 of %esp returning to 16-bit stack
2014-06-05 07:46:15 -07:00
Linus Torvalds
9df0fe64eb Merge branch 'x86-x32-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip into next
Pull x86 x32 ABI fix from Peter Anvin:
 "A single fix for the x32 ABI: the io_setup() and io_submit() system
  call need to use the compat stubs"

* 'x86-x32-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
  x86, x32: Use compat shims for io_{setup,submit}
2014-06-05 07:38:07 -07:00
Ingo Molnar
8c6e549a44 Merge branch 'uprobes/core' of git://git.kernel.org/pub/scm/linux/kernel/git/oleg/misc into perf/core
Pull uprobes tmpfs support patches from Oleg Nesterov.

Signed-off-by: Ingo Molnar <mingo@kernel.org>
2014-06-05 16:35:30 +02:00
Igor Mammedov
3e1a878b7c x86/smpboot: Initialize secondary CPU only if master CPU will wait for it
Hang is observed on virtual machines during CPU hotplug,
especially in big guests with many CPUs. (It reproducible
more often if host is over-committed).

It happens because master CPU gives up waiting on
secondary CPU and allows it to run wild. As result
AP causes locking or crashing system. For example
as described here:

   https://lkml.org/lkml/2014/3/6/257

If master CPU have sent STARTUP IPI successfully,
and AP signalled to master CPU that it's ready
to start initialization, make master CPU wait
indefinitely till AP is onlined.
To ensure that AP won't ever run wild, make it
wait at early startup till master CPU confirms its
intention to wait for AP. If AP doesn't respond in 10
seconds, the master CPU will timeout and cancel
AP onlining.

Signed-off-by: Igor Mammedov <imammedo@redhat.com>
Acked-by: Toshi Kani <toshi.kani@hp.com>
Cc: Thomas Gleixner <tglx@linutronix.de>
Link: http://lkml.kernel.org/r/1401975765-22328-4-git-send-email-imammedo@redhat.com
Signed-off-by: Ingo Molnar <mingo@kernel.org>
2014-06-05 16:33:08 +02:00
Igor Mammedov
feef1e8ecb x86/smpboot: Log error on secondary CPU wakeup failure at ERR level
If system is running without debug level logging,
it will not log error if do_boot_cpu() failed to
wakeup AP. It may lead to silent AP bringup
failures at boot time.
Change message level to KERN_ERR to make error
visible to user as it's done on other architectures.

Signed-off-by: Igor Mammedov <imammedo@redhat.com>
Acked-by: Toshi Kani <toshi.kani@hp.com>
Cc: Thomas Gleixner <tglx@linutronix.de>
Link: http://lkml.kernel.org/r/1401975765-22328-3-git-send-email-imammedo@redhat.com
Signed-off-by: Ingo Molnar <mingo@kernel.org>
2014-06-05 16:33:07 +02:00
Igor Mammedov
89f898c1e1 x86: Fix list/memory corruption on CPU hotplug
currently if AP wake up is failed, master CPU marks AP as not
present in do_boot_cpu() by calling set_cpu_present(cpu, false).
That leads to following list corruption on the next physical CPU
hotplug:

[  418.107336] WARNING: CPU: 1 PID: 45 at lib/list_debug.c:33 __list_add+0xbe/0xd0()
[  418.115268] list_add corruption. prev->next should be next (ffff88003dc57600), but was ffff88003e20c3a0. (prev=ffff88003e20c3a0).
[  418.123693] Modules linked in: nf_conntrack_netbios_ns nf_conntrack_broadcast ipt_MASQUERADE ip6t_REJECT ipt_REJECT cfg80211 xt_conntrack rfkill ee
[  418.138979] CPU: 1 PID: 45 Comm: kworker/u10:1 Not tainted 3.14.0-rc6+ #387
[  418.149989] Hardware name: Red Hat KVM, BIOS 0.5.1 01/01/2007
[  418.165750] Workqueue: kacpi_hotplug acpi_hotplug_work_fn
[  418.166433]  0000000000000021 ffff880038ca7988 ffffffff8159b22d 0000000000000021
[  418.176460]  ffff880038ca79d8 ffff880038ca79c8 ffffffff8106942c ffff880038ca79e8
[  418.177453]  ffff88003e20c3a0 ffff88003dc57600 ffff88003e20c3a0 00000000ffffffea
[  418.178445] Call Trace:
[  418.185811]  [<ffffffff8159b22d>] dump_stack+0x49/0x5c
[  418.186440]  [<ffffffff8106942c>] warn_slowpath_common+0x8c/0xc0
[  418.187192]  [<ffffffff81069516>] warn_slowpath_fmt+0x46/0x50
[  418.191231]  [<ffffffff8136ef51>] ? acpi_ns_get_node+0xb7/0xc7
[  418.193889]  [<ffffffff812f796e>] __list_add+0xbe/0xd0
[  418.196649]  [<ffffffff812e2aa9>] kobject_add_internal+0x79/0x200
[  418.208610]  [<ffffffff812e2e18>] kobject_add_varg+0x38/0x60
[  418.213831]  [<ffffffff812e2ef4>] kobject_add+0x44/0x70
[  418.229961]  [<ffffffff813e2c60>] device_add+0xd0/0x550
[  418.234991]  [<ffffffff813f0e95>] ? pm_runtime_init+0xe5/0xf0
[  418.250226]  [<ffffffff813e32be>] device_register+0x1e/0x30
[  418.255296]  [<ffffffff813e82a3>] register_cpu+0xe3/0x130
[  418.266539]  [<ffffffff81592be5>] arch_register_cpu+0x65/0x150
[  418.285845]  [<ffffffff81355c0d>] acpi_processor_hotadd_init+0x5a/0x9b
...
Which is caused by the fact that generic_processor_info() allocates
logical CPU id by calling:

 cpu = cpumask_next_zero(-1, cpu_present_mask);

which returns id of previously failed to wake up CPU, since its
bit is cleared by do_boot_cpu() and as result register_cpu()
tries to register another CPU with the same id as already
present but failed to be onlined CPU.

Taking in account that AP will not do anything if master CPU
failed to wake it up, there is no reason to mark that AP as not
present and break next cpu hotplug attempts. As a side effect of
not marking AP as not present, user would be allowed to online
it again later.

Also fix memory corruption in acpi_unmap_lsapic()

if during CPU hotplug master CPU failed to wake up AP
it set percpu x86_cpu_to_apicid to BAD_APICID=0xFFFF for AP.

However following attempt to unplug that CPU will lead to
out of bound write access to __apicid_to_node[] which is
32768 items long on x86_64 kernel.

So with above fix of cpu_present_mask make sure that a present
CPU has a valid APIC ID by not setting x86_cpu_to_apicid
to BAD_APICID in do_boot_cpu() on failure and allow
acpi_processor_remove()->acpi_unmap_lsapic() cleanly remove CPU.

Signed-off-by: Igor Mammedov <imammedo@redhat.com>
Acked-by: Toshi Kani <toshi.kani@hp.com>
Cc: Thomas Gleixner <tglx@linutronix.de>
Link: http://lkml.kernel.org/r/1401975765-22328-2-git-send-email-imammedo@redhat.com
Signed-off-by: Ingo Molnar <mingo@kernel.org>
2014-06-05 16:33:07 +02:00
Oleg Nesterov
5cdb76d6f0 uprobes/x86: Rename arch_uprobe->def to ->defparam, minor comment updates
Purely cosmetic, no changes in .o,

1. As Jim pointed out arch_uprobe->def looks ambiguous, rename it to
   ->defparam.

2. Add the comment into default_post_xol_op() to explain "regs->sp +=".

3. Remove the stale part of the comment in arch_uprobe_analyze_insn().

Suggested-by: Jim Keniston <jkenisto@us.ibm.com>
Reviewed-by: Masami Hiramatsu <masami.hiramatsu.pt@hitachi.com>
Acked-by: Srikar Dronamraju <srikar@linux.vnet.ibm.com>
Signed-off-by: Oleg Nesterov <oleg@redhat.com>
2014-06-05 16:21:57 +02:00
Anshuman Khandual
37548914fb perf/x86: Add conditional branch filtering support
This patch adds conditional branch filtering support,
enabling it for PERF_SAMPLE_BRANCH_COND in perf branch
stack sampling framework by utilizing an available
software filter X86_BR_JCC.

Signed-off-by: Anshuman Khandual <khandual@linux.vnet.ibm.com>
Reviewed-by: Stephane Eranian <eranian@google.com>
Reviewed-by: Andi Kleen <ak@linux.intel.com>
Signed-off-by: Peter Zijlstra <peterz@infradead.org>
Cc: mpe@ellerman.id.au
Cc: benh@kernel.crashing.org
Cc: Arnaldo Carvalho de Melo <acme@kernel.org>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Link: http://lkml.kernel.org/r/1400743210-32289-3-git-send-email-khandual@linux.vnet.ibm.com
Signed-off-by: Ingo Molnar <mingo@kernel.org>
2014-06-05 12:30:23 +02:00
Vince Weaver
c184c980de perf/x86: Use common PMU interrupt disabled code
Make the x86 perf code use the new common PMU interrupt disabled code.

Typically most x86 machines have working PMU interrupts, although
some older p6-class machines had this problem.

Signed-off-by: Vince Weaver <vincent.weaver@maine.edu>
Signed-off-by: Peter Zijlstra <peterz@infradead.org>
Cc: Arnaldo Carvalho de Melo <acme@kernel.org>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Link: http://lkml.kernel.org/r/alpine.DEB.2.10.1405161715560.11099@vincent-weaver-1.umelst.maine.edu
Signed-off-by: Ingo Molnar <mingo@kernel.org>
2014-06-05 12:30:03 +02:00
Dave Airlie
8d4ad9d4bb Merge commit '9e9a928eed8796a0a1aaed7e0b676db86ba84594' into drm-next
Merge drm-fixes into drm-next.

Both i915 and radeon need this done for later patches.

Conflicts:
	drivers/gpu/drm/drm_crtc_helper.c
	drivers/gpu/drm/i915/i915_drv.h
	drivers/gpu/drm/i915/i915_gem.c
	drivers/gpu/drm/i915/i915_gem_execbuffer.c
	drivers/gpu/drm/i915/i915_gem_gtt.c
2014-06-05 20:28:59 +10:00
Ingo Molnar
10b0256496 Merge branch 'perf/kprobes' into perf/core
Conflicts:
	arch/x86/kernel/traps.c

The kprobes enhancements are fully cooked, ship them upstream.

Signed-off-by: Ingo Molnar <mingo@kernel.org>
2014-06-05 12:26:50 +02:00
Ingo Molnar
c56d34064b Merge branch 'perf/uprobes' into perf/core
These bits from Oleg are fully cooked, ship them to Linus.

Signed-off-by: Ingo Molnar <mingo@kernel.org>
2014-06-05 12:26:27 +02:00
Linus Torvalds
00170fdd08 Merge branch 'akpm' (patchbomb from Andrew) into next
Merge misc updates from Andrew Morton:

 - a few fixes for 3.16.  Cc'ed to stable so they'll get there somehow.

 - various misc fixes and cleanups

 - most of the ocfs2 queue.  Review is slow...

 - most of MM.  The MM queue is pretty huge this time, but not much in
   the way of feature work.

 - some tweaks under kernel/

 - printk maintenance work

 - updates to lib/

 - checkpatch updates

 - tweaks to init/

* emailed patches from Andrew Morton <akpm@linux-foundation.org>: (276 commits)
  fs/autofs4/dev-ioctl.c: add __init to autofs_dev_ioctl_init
  fs/ncpfs/getopt.c: replace simple_strtoul by kstrtoul
  init/main.c: remove an ifdef
  kthreads: kill CLONE_KERNEL, change kernel_thread(kernel_init) to avoid CLONE_SIGHAND
  init/main.c: add initcall_blacklist kernel parameter
  init/main.c: don't use pr_debug()
  fs/binfmt_flat.c: make old_reloc() static
  fs/binfmt_elf.c: fix bool assignements
  fs/efs: convert printk(KERN_DEBUG to pr_debug
  fs/efs: add pr_fmt / use __func__
  fs/efs: convert printk to pr_foo()
  scripts/checkpatch.pl: device_initcall is not the only __initcall substitute
  checkpatch: check stable email address
  checkpatch: warn on unnecessary void function return statements
  checkpatch: prefer kstrto<foo> to sscanf(buf, "%<lhuidx>", &bar);
  checkpatch: add warning for kmalloc/kzalloc with multiply
  checkpatch: warn on #defines ending in semicolon
  checkpatch: make --strict a default for files in drivers/net and net/
  checkpatch: always warn on missing blank line after variable declaration block
  checkpatch: fix wildcard DT compatible string checking
  ...
2014-06-04 16:55:13 -07:00
Borislav Petkov
a8fe19ebfb kernel/printk: use symbolic defines for console loglevels
... instead of naked numbers.

Stuff in sysrq.c used to set it to 8 which is supposed to mean above
default level so set it to DEBUG instead as we're terminating/killing all
tasks and we want to be verbose there.

Also, correct the check in x86_64_start_kernel which should be >= as
we're clearly issuing the string there for all debug levels, not only
the magical 10.

Signed-off-by: Borislav Petkov <bp@suse.de>
Acked-by: Kees Cook <keescook@chromium.org>
Acked-by: Randy Dunlap <rdunlap@infradead.org>
Cc: Joe Perches <joe@perches.com>
Cc: Valdis Kletnieks <Valdis.Kletnieks@vt.edu>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2014-06-04 16:54:17 -07:00
Fabian Frederick
f6187769da sys_sgetmask/sys_ssetmask: add CONFIG_SGETMASK_SYSCALL
sys_sgetmask and sys_ssetmask are obsolete system calls no longer
supported in libc.

This patch replaces architecture related __ARCH_WANT_SYS_SGETMAX by expert
mode configuration.That option is enabled by default for those
architectures.

Signed-off-by: Fabian Frederick <fabf@skynet.be>
Cc: Steven Miao <realmz6@gmail.com>
Cc: Mikael Starvik <starvik@axis.com>
Cc: Jesper Nilsson <jesper.nilsson@axis.com>
Cc: David Howells <dhowells@redhat.com>
Cc: Geert Uytterhoeven <geert@linux-m68k.org>
Cc: Michal Simek <monstr@monstr.eu>
Cc: Ralf Baechle <ralf@linux-mips.org>
Cc: Koichi Yasutake <yasutake.koichi@jp.panasonic.com>
Cc: "James E.J. Bottomley" <jejb@parisc-linux.org>
Cc: Helge Deller <deller@gmx.de>
Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org>
Cc: Paul Mackerras <paulus@samba.org>
Cc: "David S. Miller" <davem@davemloft.net>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: "H. Peter Anvin" <hpa@zytor.com>
Cc: Greg Ungerer <gerg@uclinux.org>
Cc: Heiko Carstens <heiko.carstens@de.ibm.com>
Cc: Oleg Nesterov <oleg@redhat.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2014-06-04 16:54:14 -07:00
Chen Yucong
65eb71823b hwpoison: remove unused global variable in do_machine_check()
Remove an unused global variable mce_entry and relative operations in
do_machine_check().

Signed-off-by: Chen Yucong <slaoub@gmail.com>
Cc: Naoya Horiguchi <n-horiguchi@ah.jp.nec.com>
Cc: Wu Fengguang <fengguang.wu@intel.com>
Cc: Andi Kleen <andi@firstfloor.org>
Cc: Ingo Molnar <mingo@elte.hu>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: "H. Peter Anvin" <hpa@zytor.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2014-06-04 16:54:11 -07:00
Emil Medve
af4459d363 arch/x86/mm/numa.c: use for_each_memblock()
Signed-off-by: Emil Medve <Emilian.Medve@Freescale.com>
Cc: Ingo Molnar <mingo@elte.hu>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: "H. Peter Anvin" <hpa@zytor.com>
Cc: Yinghai Lu <yinghai@kernel.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2014-06-04 16:54:05 -07:00
Cyrill Gorcunov
2bf01f9f0c mm: x86 pgtable: require X86_64 for soft-dirty tracker
Tracking dirty status on 2 level pages requires very ugly macros and
taking into account how old the machines who can operate without PAE
mode only are, lets drop soft dirty tracker from them for code
simplicity (note I can't drop all the macros from 2 level pages by now
since _PAGE_BIT_PROTNONE and _PAGE_BIT_FILE are still used even without
tracker).

Linus proposed to completely rip off softdirty support on x86-32 (even
with PAE) and since for CRIU we're not planning to support native x86-32
mode, lets do that.

(Softdirty tracker is relatively new feature which is mostly used by
CRIU so I don't expect if such API change would cause problems for
userspace).

Signed-off-by: Cyrill Gorcunov <gorcunov@openvz.org>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Mel Gorman <mgorman@suse.de>
Cc: Peter Anvin <hpa@zytor.com>
Cc: Ingo Molnar <mingo@kernel.org>
Cc: Steven Noonan <steven@uplinklabs.net>
Cc: Rik van Riel <riel@redhat.com>
Cc: David Vrabel <david.vrabel@citrix.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Pavel Emelyanov <xemul@parallels.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2014-06-04 16:54:05 -07:00
Cyrill Gorcunov
2373eaecff mm: x86 pgtable: drop unneeded preprocessor ifdef
_PAGE_BIT_FILE (bit 6) is always less than _PAGE_BIT_PROTNONE (bit 8), so
drop redundant #ifdef.

Signed-off-by: Cyrill Gorcunov <gorcunov@openvz.org>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Mel Gorman <mgorman@suse.de>
Cc: Peter Anvin <hpa@zytor.com>
Cc: Ingo Molnar <mingo@kernel.org>
Cc: Steven Noonan <steven@uplinklabs.net>
Cc: Rik van Riel <riel@redhat.com>
Cc: David Vrabel <david.vrabel@citrix.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Pavel Emelyanov <xemul@parallels.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2014-06-04 16:54:05 -07:00
Akinobu Mita
38f7ea5a08 arch/x86/kernel/pci-dma.c: fix dma_generic_alloc_coherent() when CONFIG_DMA_CMA is enabled
dma_generic_alloc_coherent() firstly attempts to allocate by
dma_alloc_from_contiguous() if CONFIG_DMA_CMA is enabled.  But the
memory region allocated by it may not fit within the device's DMA mask.
This change makes it fall back to usual alloc_pages_node() allocation
for such cases.

Signed-off-by: Akinobu Mita <akinobu.mita@gmail.com>
Cc: Marek Szyprowski <m.szyprowski@samsung.com>
Cc: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>
Cc: David Woodhouse <dwmw2@infradead.org>
Cc: Don Dutile <ddutile@redhat.com>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: "H. Peter Anvin" <hpa@zytor.com>
Cc: Andi Kleen <andi@firstfloor.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2014-06-04 16:53:57 -07:00
Akinobu Mita
5ea3b1b2f8 cma: add placement specifier for "cma=" kernel parameter
Currently, "cma=" kernel parameter is used to specify the size of CMA,
but we can't specify where it is located.  We want to locate CMA below
4GB for devices only supporting 32-bit addressing on 64-bit systems
without iommu.

This enables to specify the placement of CMA by extending "cma=" kernel
parameter.

Examples:
 1. locate 64MB CMA below 4GB by "cma=64M@0-4G"
 2. locate 64MB CMA exact at 512MB by "cma=64M@512M"

Note that the DMA contiguous memory allocator on x86 assumes that
page_address() works for the pages to allocate.  So this change requires
to limit end address of contiguous memory area upto max_pfn_mapped to
prevent from locating it on highmem area by the argument of
dma_contiguous_reserve().

Signed-off-by: Akinobu Mita <akinobu.mita@gmail.com>
Cc: Marek Szyprowski <m.szyprowski@samsung.com>
Cc: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>
Cc: David Woodhouse <dwmw2@infradead.org>
Cc: Don Dutile <ddutile@redhat.com>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: "H. Peter Anvin" <hpa@zytor.com>
Cc: Andi Kleen <andi@firstfloor.org>
Cc: Yinghai Lu <yinghai@kernel.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2014-06-04 16:53:57 -07:00
Akinobu Mita
9c5a362142 x86: enable DMA CMA with swiotlb
The DMA Contiguous Memory Allocator support on x86 is disabled when
swiotlb config option is enabled.  So DMA CMA is always disabled on
x86_64 because swiotlb is always enabled.  This attempts to support for
DMA CMA with enabling swiotlb config option.

The contiguous memory allocator on x86 is integrated in the function
dma_generic_alloc_coherent() which is .alloc callback in nommu_dma_ops
for dma_alloc_coherent().

x86_swiotlb_alloc_coherent() which is .alloc callback in swiotlb_dma_ops
tries to allocate with dma_generic_alloc_coherent() firstly and then
swiotlb_alloc_coherent() is called as a fallback.

The main part of supporting DMA CMA with swiotlb is that changing
x86_swiotlb_free_coherent() which is .free callback in swiotlb_dma_ops
for dma_free_coherent() so that it can distinguish memory allocated by
dma_generic_alloc_coherent() from one allocated by
swiotlb_alloc_coherent() and release it with dma_generic_free_coherent()
which can handle contiguous memory.  This change requires making
is_swiotlb_buffer() global function.

This also needs to change .free callback in the dma_map_ops for amd_gart
and sta2x11, because these dma_ops are also using
dma_generic_alloc_coherent().

Signed-off-by: Akinobu Mita <akinobu.mita@gmail.com>
Acked-by: Marek Szyprowski <m.szyprowski@samsung.com>
Acked-by: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>
Cc: David Woodhouse <dwmw2@infradead.org>
Cc: Don Dutile <ddutile@redhat.com>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: "H. Peter Anvin" <hpa@zytor.com>
Cc: Andi Kleen <andi@firstfloor.org>
Cc: Yinghai Lu <yinghai@kernel.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2014-06-04 16:53:57 -07:00
Akinobu Mita
d92ef66c4f x86: make dma_alloc_coherent() return zeroed memory if CMA is enabled
This patchset enhances the DMA Contiguous Memory Allocator on x86.

Currently the DMA CMA is only supported with pci-nommu dma_map_ops and
furthermore it can't be enabled on x86_64.  But I would like to allocate
big contiguous memory with dma_alloc_coherent() and tell it to the device
that requires it, regardless of which dma mapping implementation is
actually used in the system.

So this makes it work with swiotlb and intel-iommu dma_map_ops, too.  And
this also extends "cma=" kernel parameter to specify placement constraint
by the physical address range of memory allocations.  For example, CMA
allocates memory below 4GB by "cma=64M@0-4G", it is required for the
devices only supporting 32-bit addressing on 64-bit systems without iommu.

This patch (of 5):

Calling dma_alloc_coherent() with __GFP_ZERO must return zeroed memory.

But when the contiguous memory allocator (CMA) is enabled on x86 and the
memory region is allocated by dma_alloc_from_contiguous(), it doesn't
return zeroed memory.  Because dma_generic_alloc_coherent() forgot to fill
the memory region with zero if it was allocated by
dma_alloc_from_contiguous()

Most implementations of dma_alloc_coherent() return zeroed memory
regardless of whether __GFP_ZERO is specified.  So this fixes it by
unconditionally zeroing the allocated memory region.

Alternatively, we could fix dma_alloc_from_contiguous() to return zeroed
out memory and remove memset() from all caller of it.  But we can't simply
remove the memset on arm because __dma_clear_buffer() is used there for
ensuring cache flushing and it is used in many places.  Of course we can
do redundant memset in dma_alloc_from_contiguous(), but I think this patch
is less impact for fixing this problem.

Signed-off-by: Akinobu Mita <akinobu.mita@gmail.com>
Cc: Marek Szyprowski <m.szyprowski@samsung.com>
Cc: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>
Cc: David Woodhouse <dwmw2@infradead.org>
Cc: Don Dutile <ddutile@redhat.com>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: "H. Peter Anvin" <hpa@zytor.com>
Cc: Andi Kleen <andi@firstfloor.org>
Cc: Yinghai Lu <yinghai@kernel.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2014-06-04 16:53:57 -07:00
Yinghai Lu
982792c782 x86, mm: probe memory block size for generic x86 64bit
On system with 2TiB ram, current x86_64 have 128M as section size, and
one memory_block only include one section.  So will have 16400 entries
under /sys/devices/system/memory/.

Current code try to use block id to find block pointer in /sys for any
section, and reuse that block pointer.  that finding will take some time
even after commit 7c243c7168 ("mm: speedup in __early_pfn_to_nid")
that will skip the search in that case during booting up.

So solution could be increase block size just like SGI UV system did.
(harded code to 2g).

This patch is trying to probe the block size to make it match mmio remap
size.  for example, Intel Nehalem later system will have memory range [0,
TOML), [4g, TOMH].  If the memory hole is 2g and total is 128g, TOM will
be 2g, and TOM2 will be 130g.

We could use 2g as block size instead of default 128M.  That will reduce
number of entries in /sys/devices/system/memory/

On system 6TiB system will reduce boot time by 35 seconds.

Signed-off-by: Yinghai Lu <yinghai@kernel.org>
Cc: Ingo Molnar <mingo@elte.hu>
Cc: "H. Peter Anvin" <hpa@zytor.com>
Cc: Thomas Gleixner <tglx@linutronix.de>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2014-06-04 16:53:55 -07:00
Mel Gorman
c46a7c817e x86: define _PAGE_NUMA by reusing software bits on the PMD and PTE levels
_PAGE_NUMA is currently an alias of _PROT_PROTNONE to trap NUMA hinting
faults on x86.  Care is taken such that _PAGE_NUMA is used only in
situations where the VMA flags distinguish between NUMA hinting faults
and prot_none faults.  This decision was x86-specific and conceptually
it is difficult requiring special casing to distinguish between PROTNONE
and NUMA ptes based on context.

Fundamentally, we only need the _PAGE_NUMA bit to tell the difference
between an entry that is really unmapped and a page that is protected
for NUMA hinting faults as if the PTE is not present then a fault will
be trapped.

Swap PTEs on x86-64 use the bits after _PAGE_GLOBAL for the offset.
This patch shrinks the maximum possible swap size and uses the bit to
uniquely distinguish between NUMA hinting ptes and swap ptes.

Signed-off-by: Mel Gorman <mgorman@suse.de>
Cc: David Vrabel <david.vrabel@citrix.com>
Cc: Ingo Molnar <mingo@kernel.org>
Cc: Peter Anvin <hpa@zytor.com>
Cc: Fengguang Wu <fengguang.wu@intel.com>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Steven Noonan <steven@uplinklabs.net>
Cc: Rik van Riel <riel@redhat.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Andrea Arcangeli <aarcange@redhat.com>
Cc: Dave Hansen <dave.hansen@intel.com>
Cc: Srikar Dronamraju <srikar@linux.vnet.ibm.com>
Cc: Cyrill Gorcunov <gorcunov@gmail.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2014-06-04 16:53:55 -07:00
Mel Gorman
4468dd76f5 x86: require x86-64 for automatic NUMA balancing
32-bit support for NUMA is an oddity on its own but with automatic NUMA
balancing on top there is a reasonable risk that the CPUPID information
cannot be stored in the page flags.  This patch removes support for
automatic NUMA support on 32-bit x86.

Signed-off-by: Mel Gorman <mgorman@suse.de>
Cc: David Vrabel <david.vrabel@citrix.com>
Cc: Ingo Molnar <mingo@kernel.org>
Cc: Peter Anvin <hpa@zytor.com>
Cc: Fengguang Wu <fengguang.wu@intel.com>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Steven Noonan <steven@uplinklabs.net>
Cc: Rik van Riel <riel@redhat.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Andrea Arcangeli <aarcange@redhat.com>
Cc: Dave Hansen <dave.hansen@intel.com>
Cc: Srikar Dronamraju <srikar@linux.vnet.ibm.com>
Cc: Cyrill Gorcunov <gorcunov@gmail.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2014-06-04 16:53:55 -07:00
Naoya Horiguchi
c177c81e09 hugetlb: restrict hugepage_migration_support() to x86_64
Currently hugepage migration is available for all archs which support
pmd-level hugepage, but testing is done only for x86_64 and there're
bugs for other archs.  So to avoid breaking such archs, this patch
limits the availability strictly to x86_64 until developers of other
archs get interested in enabling this feature.

Simply disabling hugepage migration on non-x86_64 archs is not enough to
fix the reported problem where sys_move_pages() hits the BUG_ON() in
follow_page(FOLL_GET), so let's fix this by checking if hugepage
migration is supported in vma_migratable().

Signed-off-by: Naoya Horiguchi <n-horiguchi@ah.jp.nec.com>
Reported-by: Michael Ellerman <mpe@ellerman.id.au>
Tested-by: Michael Ellerman <mpe@ellerman.id.au>
Acked-by: Hugh Dickins <hughd@google.com>
Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org>
Cc: Tony Luck <tony.luck@intel.com>
Cc: Russell King <rmk@arm.linux.org.uk>
Cc: Martin Schwidefsky <schwidefsky@de.ibm.com>
Cc: James Hogan <james.hogan@imgtec.com>
Cc: Ralf Baechle <ralf@linux-mips.org>
Cc: David Miller <davem@davemloft.net>
Cc: <stable@vger.kernel.org>	[3.12+]
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2014-06-04 16:53:51 -07:00
Linus Torvalds
d09cc3659d Merge branch 'irq-core-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip into next
Pull core irq updates from Thomas Gleixner:
 "The irq department delivers:

   - Another tree wide update to get rid of the horrible create_irq
     interface along with its even more horrible variants.  That also
     gets rid of the last leftovers of the initial sparse irq hackery.
     arch/driver specific changes have been either acked or ignored.

   - A fix for the spurious interrupt detection logic with threaded
     interrupts.

   - A new ARM SoC interrupt controller

   - The usual pile of fixes and improvements all over the place"

* 'irq-core-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: (40 commits)
  Documentation: brcmstb-l2: Add Broadcom STB Level-2 interrupt controller binding
  irqchip: brcmstb-l2: Add Broadcom Set Top Box Level-2 interrupt controller
  genirq: Improve documentation to match current implementation
  ARM: iop13xx: fix msi support with sparse IRQ
  genirq: Provide !SMP stub for irq_set_affinity_notifier()
  irqchip: armada-370-xp: Move the devicetree binding documentation
  irqchip: gic: Use mask field in GICC_IAR
  genirq: Remove dynamic_irq mess
  ia64: Use irq_init_desc
  genirq: Replace dynamic_irq_init/cleanup
  genirq: Remove irq_reserve_irq[s]
  genirq: Replace reserve_irqs in core code
  s390: Avoid call to irq_reserve_irqs()
  s390: Remove pointless arch_show_interrupts()
  s390: pci: Check return value of alloc_irq_desc() proper
  sh: intc: Remove pointless irq_reserve_irqs() invocation
  x86, irq: Remove pointless irq_reserve_irqs() call
  genirq: Make create/destroy_irq() ia64 private
  tile: Use SPARSE_IRQ
  tile: pci: Use irq_alloc/free_hwirq()
  ...
2014-06-04 15:59:13 -07:00
Linus Torvalds
d27050641e DeviceTree for 3.16:
- Another round of clean-up of FDT related code in architecture code.
   This removes knowledge of internal FDT details from most architectures
   except powerpc.
 - Conversion of kernel's custom FDT parsing code to use libfdt.
 - DT based initialization for generic serial earlycon. The introduction
   of generic serial earlycon support went in thru tty tree.
 - Improve the platform device naming for DT probed devices to ensure
   unique naming and use parent names instead of a global index.
 - Fix a race condition in of_update_property.
 - Unify the various linker section OF match tables and fix several
   function prototype errors.
 - Update platform_get_irq_byname to work in deferred probe cases.
 - 2 binding doc updates
 -----BEGIN PGP SIGNATURE-----
 Version: GnuPG v1
 
 iQEcBAABAgAGBQJTjzgyAAoJEMhvYp4jgsXiFsUH/1PMTGo8CyD62VQD5ZKdAoW+
 Fq6vCiRQ8assF5i5ZLcW1DqhjtoRaCKYhVbRKa5lj7cZdjlSpacI/qQPrF5Br2Ii
 bTE3Ff/AQwipQaz/Bj7HqJCgGwfWK8xdfgW0abKsyXMWDN86Bov/zzeu8apmws0x
 H1XjJRgnc/rzM4m9ny6+lss0iq6YL54SuTYNzHR33+Ywxls69SfHXIhCW0KpZcBl
 5U3YUOomt40GfO46sxFA4xApAhypEK4oVq7asyiA2ArTZ/c2Pkc9p5CBqzhDLmlq
 yioWTwHIISv0q+yMLCuQrVGIsbUDkQyy7RQ15z6U+/e/iGO/M+j3A5yxMc3qOi4=
 =Onff
 -----END PGP SIGNATURE-----

Merge tag 'devicetree-for-3.16' of git://git.kernel.org/pub/scm/linux/kernel/git/robh/linux into next

Pull DeviceTree updates from Rob Herring:
 - Another round of clean-up of FDT related code in architecture code.
   This removes knowledge of internal FDT details from most
   architectures except powerpc.
 - Conversion of kernel's custom FDT parsing code to use libfdt.
 - DT based initialization for generic serial earlycon.  The
   introduction of generic serial earlycon support went in through the
   tty tree.
 - Improve the platform device naming for DT probed devices to ensure
   unique naming and use parent names instead of a global index.
 - Fix a race condition in of_update_property.
 - Unify the various linker section OF match tables and fix several
   function prototype errors.
 - Update platform_get_irq_byname to work in deferred probe cases.
 - 2 binding doc updates

* tag 'devicetree-for-3.16' of git://git.kernel.org/pub/scm/linux/kernel/git/robh/linux: (58 commits)
  of: handle NULL node in next_child iterators
  of/irq: provide more wrappers for !CONFIG_OF
  devicetree: bindings: Document micrel vendor prefix
  dt: bindings: dwc2: fix required value for the phy-names property
  of_pci_irq: kill useless variable in of_irq_parse_pci()
  of/irq: do irq resolution in platform_get_irq_byname()
  of: Add a testcase for of_find_node_by_path()
  of: Make of_find_node_by_path() handle /aliases
  of: Create unlocked version of for_each_child_of_node()
  lib: add glibc style strchrnul() variant
  of: Handle memory@0 node on PPC32 only
  pci/of: Remove dead code
  of: fix race between search and remove in of_update_property()
  of: Use NULL for pointers
  of: Stop naming platform_device using dcr address
  of: Ensure unique names without sacrificing determinism
  tty/serial: pl011: add DT based earlycon support
  of/fdt: add FDT serial scanning for earlycon
  of/fdt: add FDT address translation support
  serial: earlycon: add DT support
  ...
2014-06-04 10:02:38 -07:00