Commit Graph

5193 Commits

Author SHA1 Message Date
Bjorn Andersson
28978713c5 net: qrtr: Move constants to header file
The constants are used by both the name server and clients, so clarify
their value and move them to the uapi header.

Signed-off-by: Bjorn Andersson <bjorn.andersson@linaro.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
2017-10-11 15:28:38 -07:00
David S. Miller
df2fd38a08 Work continues in various areas:
* port authorized event for 4-way-HS offload (Avi)
  * enable MFP optional for such devices (Emmanuel)
  * Kees's timer setup patch for mac80211 mesh
    (the part that isn't trivially scripted)
  * improve VLAN vs. TXQ handling (myself)
  * load regulatory database as firmware file (myself)
  * with various other small improvements and cleanups
 
 I merged net-next once in the meantime to allow Kees's
 timer setup patch to go in.
 -----BEGIN PGP SIGNATURE-----
 
 iQIzBAABCgAdFiEExu3sM/nZ1eRSfR9Ha3t4Rpy0AB0FAlneDzEACgkQa3t4Rpy0
 AB3EHBAAhQana6YiMx0Ag4ANGlll3xnxFCZlkmlBoJ/EwKgQhPonylHntuvtkXf6
 kZRsOr4uA+wpN/opHLGfMJzat9uxztHVo2sT4rxVnvZq4DYcB/JdlhTMLZDsdDgm
 kHRpUEKh/+2FAgq2A4VEUpVb+Mtg0dq8iJJXFw89xb3Sw5UhNA6ljWQZ4zpXuI0P
 xOB8Z52LqAcMNnspP+L2TRpanu2ETLcl4Laj+cMl1Yiut2GHkclXUoGvbZ1al5SO
 CYqpjVKk67ENLJMrmhQ7DVzj0rpwlV+Eh756RU9DhamPAWbxqWLWJgfuGBskRXnI
 GneCUQkLZ5j1kUJjvQdXBv1UmpkCG4/3yITZX8kL3UR+AbhSCqzVQDo7it5hsWEf
 XTNAlhdTDhSn7OQQ6XOxvWeydAiaaz671bhPuIvKEo9D/+7Uv0PxHmvu8QqUm0xH
 Wvyh0LYRrblDz7fgEkaFctjJKYKnwviQ9O2LGx98C8NVam+Qyti2MlLA4AO5E+it
 ky97W3Dh5ftjQhFD0Ip9P4+BO/9hvNELlCRWUXI197n6B0/KH7FWX1eqw/vpnKc4
 w7VB/V59mB8zMmZ1QUdwT1/Ru+MD++6ds93STttZvH/0P3H0dDRGuxUK4m32YHiX
 s97uSBAbBMy2UH6b8HyxjVMGWvmW3KRakBID1zv2NRSIXtyfWj4=
 =gW8q
 -----END PGP SIGNATURE-----

Merge tag 'mac80211-next-for-davem-2017-10-11' of git://git.kernel.org/pub/scm/linux/kernel/git/jberg/mac80211-next

Johannes Berg says:

====================
Work continues in various areas:
 * port authorized event for 4-way-HS offload (Avi)
 * enable MFP optional for such devices (Emmanuel)
 * Kees's timer setup patch for mac80211 mesh
   (the part that isn't trivially scripted)
 * improve VLAN vs. TXQ handling (myself)
 * load regulatory database as firmware file (myself)
 * with various other small improvements and cleanups

I merged net-next once in the meantime to allow Kees's
timer setup patch to go in.
====================

Signed-off-by: David S. Miller <davem@davemloft.net>
2017-10-11 10:15:01 -07:00
Johannes Berg
1ea4ff3e9f cfg80211: support reloading regulatory database
If the regulatory database is loaded, and then updated, it may
be necessary to reload it. Add an nl80211 command to do this.

Note that this just reloads the database, it doesn't re-apply
the rules from it immediately.

Signed-off-by: Johannes Berg <johannes.berg@intel.com>
2017-10-11 13:04:15 +02:00
Eric Garver
b8226962b1 openvswitch: add ct_clear action
This adds a ct_clear action for clearing conntrack state. ct_clear is
currently implemented in OVS userspace, but is not backed by an action
in the kernel datapath. This is useful for flows that may modify a
packet tuple after a ct lookup has already occurred.

Signed-off-by: Eric Garver <e@erig.me>
Acked-by: Pravin B Shelar <pshelar@ovn.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
2017-10-10 16:38:34 -07:00
William Tu
ceaa001a17 openvswitch: Add erspan tunnel support.
Add erspan netlink interface for OVS.

Signed-off-by: William Tu <u9012063@gmail.com>
Cc: Pravin B Shelar <pshelar@ovn.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
2017-10-09 20:45:50 -07:00
David S. Miller
d93fa2ba64 Merge git://git.kernel.org/pub/scm/linux/kernel/git/davem/net 2017-10-09 20:11:09 -07:00
David S. Miller
fb60bccc06 Merge git://git.kernel.org/pub/scm/linux/kernel/git/pablo/nf
Pablo Neira Ayuso says:

====================
Netfilter/IPVS fixes for net

The following patchset contains Netfilter/IPVS fixes for your net tree,
they are:

1) Fix packet drops due to incorrect ECN handling in IPVS, from Vadim
   Fedorenko.

2) Fix splat with mark restoration in xt_socket with non-full-sock,
   patch from Subash Abhinov Kasiviswanathan.

3) ipset bogusly bails out when adding IPv4 range containing more than
   2^31 addresses, from Jozsef Kadlecsik.

4) Incorrect pernet unregistration order in ipset, from Florian Westphal.

5) Races between dump and swap in ipset results in BUG_ON splats, from
   Ross Lagerwall.

6) Fix chain renames in nf_tables, from JingPiao Chen.

7) Fix race in pernet codepath with ebtables table registration, from
   Artem Savkov.

8) Memory leak in error path in set name allocation in nf_tables, patch
   from Arvind Yadav.

9) Don't dump chain counters if they are not available, this fixes a
   crash when listing the ruleset.

10) Fix out of bound memory read in strlcpy() in x_tables compat code,
    from Eric Dumazet.

11) Make sure we only process TCP packets in SYNPROXY hooks, patch from
    Lin Zhang.

12) Cannot load rules incrementally anymore after xt_bpf with pinned
    objects, added in revision 1. From Shmulik Ladkani.
====================

Signed-off-by: David S. Miller <davem@davemloft.net>
2017-10-09 10:39:52 -07:00
Shmulik Ladkani
98589a0998 netfilter: xt_bpf: Fix XT_BPF_MODE_FD_PINNED mode of 'xt_bpf_info_v1'
Commit 2c16d60332 ("netfilter: xt_bpf: support ebpf") introduced
support for attaching an eBPF object by an fd, with the
'bpf_mt_check_v1' ABI expecting the '.fd' to be specified upon each
IPT_SO_SET_REPLACE call.

However this breaks subsequent iptables calls:

 # iptables -A INPUT -m bpf --object-pinned /sys/fs/bpf/xxx -j ACCEPT
 # iptables -A INPUT -s 5.6.7.8 -j ACCEPT
 iptables: Invalid argument. Run `dmesg' for more information.

That's because iptables works by loading existing rules using
IPT_SO_GET_ENTRIES to userspace, then issuing IPT_SO_SET_REPLACE with
the replacement set.

However, the loaded 'xt_bpf_info_v1' has an arbitrary '.fd' number
(from the initial "iptables -m bpf" invocation) - so when 2nd invocation
occurs, userspace passes a bogus fd number, which leads to
'bpf_mt_check_v1' to fail.

One suggested solution [1] was to hack iptables userspace, to perform a
"entries fixup" immediatley after IPT_SO_GET_ENTRIES, by opening a new,
process-local fd per every 'xt_bpf_info_v1' entry seen.

However, in [2] both Pablo Neira Ayuso and Willem de Bruijn suggested to
depricate the xt_bpf_info_v1 ABI dealing with pinned ebpf objects.

This fix changes the XT_BPF_MODE_FD_PINNED behavior to ignore the given
'.fd' and instead perform an in-kernel lookup for the bpf object given
the provided '.path'.

It also defines an alias for the XT_BPF_MODE_FD_PINNED mode, named
XT_BPF_MODE_PATH_PINNED, to better reflect the fact that the user is
expected to provide the path of the pinned object.

Existing XT_BPF_MODE_FD_ELF behavior (non-pinned fd mode) is preserved.

References: [1] https://marc.info/?l=netfilter-devel&m=150564724607440&w=2
            [2] https://marc.info/?l=netfilter-devel&m=150575727129880&w=2

Reported-by: Rafael Buchbinder <rafi@rbk.ms>
Signed-off-by: Shmulik Ladkani <shmulik.ladkani@gmail.com>
Acked-by: Willem de Bruijn <willemb@google.com>
Acked-by: Daniel Borkmann <daniel@iogearbox.net>
Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
2017-10-09 15:18:04 +02:00
Roopa Prabhu
821f1b21ca bridge: add new BR_NEIGH_SUPPRESS port flag to suppress arp and nd flood
This patch adds a new bridge port flag BR_NEIGH_SUPPRESS to
suppress arp and nd flood on bridge ports. It implements
rfc7432, section 10.
https://tools.ietf.org/html/rfc7432#section-10
for ethernet VPN deployments. It is similar to the existing
BR_PROXYARP* flags but has a few semantic differences to conform
to EVPN standard. Unlike the existing flags, this new flag suppresses
flood of all neigh discovery packets (arp and nd) to tunnel ports.
Supports both vlan filtering and non-vlan filtering bridges.

In case of EVPN, it is mainly used to avoid flooding
of arp and nd packets to tunnel ports like vxlan.

This patch adds netlink and sysfs support to set this bridge port
flag.

Signed-off-by: Roopa Prabhu <roopa@cumulusnetworks.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2017-10-08 21:12:04 -07:00
Martin KaFai Lau
067cae4777 bpf: Use char in prog and map name
Instead of u8, use char for prog and map name.  It can avoid the
userspace tool getting compiler's signess warning.  The
bpf_prog_aux, bpf_map, bpf_attr, bpf_prog_info and
bpf_map_info are changed.

Signed-off-by: Martin KaFai Lau <kafai@fb.com>
Cc: Jakub Kicinski <jakub.kicinski@netronome.com>
Acked-by: Daniel Borkmann <daniel@iogearbox.net>
Acked-by: Alexei Starovoitov <ast@fb.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2017-10-07 23:29:39 +01:00
Yonghong Song
4bebdc7a85 bpf: add helper bpf_perf_prog_read_value
This patch adds helper bpf_perf_prog_read_cvalue for perf event based bpf
programs, to read event counter and enabled/running time.
The enabled/running time is accumulated since the perf event open.

The typical use case for perf event based bpf program is to attach itself
to a single event. In such cases, if it is desirable to get scaling factor
between two bpf invocations, users can can save the time values in a map,
and use the value from the map and the current value to calculate
the scaling factor.

Signed-off-by: Yonghong Song <yhs@fb.com>
Acked-by: Alexei Starovoitov <ast@fb.com>
Acked-by: Daniel Borkmann <daniel@iogearbox.net>
Signed-off-by: David S. Miller <davem@davemloft.net>
2017-10-07 23:05:57 +01:00
Yonghong Song
908432ca84 bpf: add helper bpf_perf_event_read_value for perf event array map
Hardware pmu counters are limited resources. When there are more
pmu based perf events opened than available counters, kernel will
multiplex these events so each event gets certain percentage
(but not 100%) of the pmu time. In case that multiplexing happens,
the number of samples or counter value will not reflect the
case compared to no multiplexing. This makes comparison between
different runs difficult.

Typically, the number of samples or counter value should be
normalized before comparing to other experiments. The typical
normalization is done like:
  normalized_num_samples = num_samples * time_enabled / time_running
  normalized_counter_value = counter_value * time_enabled / time_running
where time_enabled is the time enabled for event and time_running is
the time running for event since last normalization.

This patch adds helper bpf_perf_event_read_value for kprobed based perf
event array map, to read perf counter and enabled/running time.
The enabled/running time is accumulated since the perf event open.
To achieve scaling factor between two bpf invocations, users
can can use cpu_id as the key (which is typical for perf array usage model)
to remember the previous value and do the calculation inside the
bpf program.

Signed-off-by: Yonghong Song <yhs@fb.com>
Acked-by: Alexei Starovoitov <ast@fb.com>
Acked-by: Daniel Borkmann <daniel@iogearbox.net>
Signed-off-by: David S. Miller <davem@davemloft.net>
2017-10-07 23:05:57 +01:00
Amine Kherbouche
bdc476413d ip_tunnel: add mpls over gre support
This commit introduces the MPLSoGRE support (RFC 4023), using ip tunnel
API by simply adding ipgre_tunnel_encap_(add|del)_mpls_ops() and the new
tunnel type TUNNEL_ENCAP_MPLS.

Signed-off-by: Amine Kherbouche <amine.kherbouche@6wind.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2017-10-07 21:38:31 +01:00
Johannes Berg
753d179ad0 Merge remote-tracking branch 'net-next/master' into mac80211-next
Merging this brings in the timer_setup() change, which allows
me to apply Kees's mac80211 changes for it.

Signed-off-by: Johannes Berg <johannes.berg@intel.com>
2017-10-06 11:46:55 +02:00
Stefan Hajnoczi
413a4317ac VSOCK: add sock_diag interface
This patch adds the sock_diag interface for querying sockets from
userspace.  Tools like ss(8) and netstat(8) can use this interface to
list open sockets.

The userspace ABI is defined in <linux/vm_sockets_diag.h> and includes
netlink request and response structs.  The request can query sockets
based on their sk_state (e.g. listening sockets only) and the response
contains socket information fields including the local/remote addresses,
inode number, etc.

This patch does not dump VMCI pending sockets because I have only tested
the virtio transport, which does not use pending sockets.  Support can
be added later by extending vsock_diag_dump() if needed by VMCI users.

Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2017-10-05 18:44:17 -07:00
David S. Miller
53954cf8c5 Merge git://git.kernel.org/pub/scm/linux/kernel/git/davem/net
Just simple overlapping changes.

Signed-off-by: David S. Miller <davem@davemloft.net>
2017-10-05 18:19:22 -07:00
Linus Torvalds
076264ada9 - A stable fix for the alignment of the event number reported at the
end of the 'DM_LIST_DEVICES' ioctl.
 
 - A couple stable fixes for the DM crypt target.
 
 - A DM raid health status reporting fix.
 -----BEGIN PGP SIGNATURE-----
 Version: GnuPG v1
 
 iQEcBAABAgAGBQJZ1pR1AAoJEMUj8QotnQNa48kIAJ+HTqeNjVhspxqKyJHPl78W
 3N/B11dWJ/CQ4xN7tbpC2gmsbnBBHE8RFTJzk3xQo7yoKsD0muqH35n0XA7X2A29
 i7DoYro/7F6ZuPlgzhzcCjA7eTugR4vcp5dTFYoIQG0DaOKAkN/+gJTVjNDjpRR5
 oGljZhKTeS4UNJTv/+ZjSMuAPycZq8LKRMOn/EgqT9MD4cIQ9VHN2qGc8jQt0Xrb
 m58URvAoFesGnSjZcypk+JG2SbUfJ4WB3Db7+A+X7lu2219FIroFhNHMk9obYhXG
 mkrhEnAsVsq/paPhCY4gdXWmSe7RNiAeSJeWhUSrNfjUACf1GF+l4CgBeBWIX+0=
 =V40h
 -----END PGP SIGNATURE-----

Merge tag 'for-4.14/dm-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/device-mapper/linux-dm

Pull device mapper fixes from Mike Snitzer:

 - a stable fix for the alignment of the event number reported at the
   end of the 'DM_LIST_DEVICES' ioctl.

 - a couple stable fixes for the DM crypt target.

 - a DM raid health status reporting fix.

* tag 'for-4.14/dm-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/device-mapper/linux-dm:
  dm raid: fix incorrect status output at the end of a "recover" process
  dm crypt: reject sector_size feature if device length is not aligned to it
  dm crypt: fix memory leak in crypt_ctr_cipher_old()
  dm ioctl: fix alignment of event number in the device list
2017-10-05 15:17:40 -07:00
Linus Torvalds
9a431ef962 Merge git://git.kernel.org/pub/scm/linux/kernel/git/davem/net
Pull networking fixes from David Miller:

 1) Check iwlwifi 9000 reorder buffer out-of-space condition properly,
    from Sara Sharon.

 2) Fix RCU splat in qualcomm rmnet driver, from Subash Abhinov
    Kasiviswanathan.

 3) Fix session and tunnel release races in l2tp, from Guillaume Nault
    and Sabrina Dubroca.

 4) Fix endian bug in sctp_diag_dump(), from Dan Carpenter.

 5) Several mlx5 driver fixes from the Mellanox folks (max flow counters
    cap check, invalid memory access in IPoIB support, etc.)

 6) tun_get_user() should bail if skb->len is zero, from Alexander
    Potapenko.

 7) Fix RCU lookups in inetpeer, from Eric Dumazet.

 8) Fix locking in packet_do_bund().

 9) Handle cb->start() error properly in netlink dump code, from Jason
    A. Donenfeld.

10) Handle multicast properly in UDP socket early demux code. From Paolo
    Abeni.

11) Several erspan bug fixes in ip_gre, from Xin Long.

12) Fix use-after-free in socket filter code, in order to handle the
    fact that listener lock is no longer taken during the three-way TCP
    handshake. From Eric Dumazet.

13) Fix infoleak in RTM_GETSTATS, from Nikolay Aleksandrov.

14) Fix tail call generation in x86-64 BPF JIT, from Alexei Starovoitov.

* git://git.kernel.org/pub/scm/linux/kernel/git/davem/net: (77 commits)
  net: 8021q: skip packets if the vlan is down
  bpf: fix bpf_tail_call() x64 JIT
  net: stmmac: dwmac-rk: Add RK3128 GMAC support
  rndis_host: support Novatel Verizon USB730L
  net: rtnetlink: fix info leak in RTM_GETSTATS call
  socket, bpf: fix possible use after free
  mlxsw: spectrum_router: Track RIF of IPIP next hops
  mlxsw: spectrum_router: Move VRF refcounting
  net: hns3: Fix an error handling path in 'hclge_rss_init_hw()'
  net: mvpp2: Fix clock resource by adding an optional bus clock
  r8152: add Linksys USB3GIGV1 id
  l2tp: fix l2tp_eth module loading
  ip_gre: erspan device should keep dst
  ip_gre: set tunnel hlen properly in erspan_tunnel_init
  ip_gre: check packet length and mtu correctly in erspan_xmit
  ip_gre: get key from session_id correctly in erspan_rcv
  tipc: use only positive error codes in messages
  ppp: fix __percpu annotation
  udp: perform source validation for mcast early demux
  IPv4: early demux can return an error code
  ...
2017-10-05 08:40:09 -07:00
Nicolas Dichtel
6621dd29eb dev: advertise the new nsid when the netns iface changes
x-netns interfaces are bound to two netns: the link netns and the upper
netns. Usually, this kind of interfaces is created in the link netns and
then moved to the upper netns. At the end, the interface is visible only
in the upper netns. The link nsid is advertised via netlink in the upper
netns, thus the user always knows where is the link part.

There is no such mechanism in the link netns. When the interface is moved
to another netns, the user cannot "follow" it.
This patch adds a new netlink attribute which helps to follow an interface
which moves to another netns. When the interface is unregistered, the new
nsid is advertised. If the interface is a x-netns interface (ie
rtnl_link_ops->get_link_net is defined), the nsid is allocated if needed.

CC: Jason A. Donenfeld <Jason@zx2c4.com>
Signed-off-by: Nicolas Dichtel <nicolas.dichtel@6wind.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2017-10-04 18:04:41 -07:00
Alexei Starovoitov
468e2f64d2 bpf: introduce BPF_PROG_QUERY command
introduce BPF_PROG_QUERY command to retrieve a set of either
attached programs to given cgroup or a set of effective programs
that will execute for events within a cgroup

Signed-off-by: Alexei Starovoitov <ast@kernel.org>
Acked-by: Daniel Borkmann <daniel@iogearbox.net>
Acked-by: Martin KaFai Lau <kafai@fb.com>
for cgroup bits
Acked-by: Tejun Heo <tj@kernel.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
2017-10-04 16:05:05 -07:00
Alexei Starovoitov
324bda9e6c bpf: multi program support for cgroup+bpf
introduce BPF_F_ALLOW_MULTI flag that can be used to attach multiple
bpf programs to a cgroup.

The difference between three possible flags for BPF_PROG_ATTACH command:
- NONE(default): No further bpf programs allowed in the subtree.
- BPF_F_ALLOW_OVERRIDE: If a sub-cgroup installs some bpf program,
  the program in this cgroup yields to sub-cgroup program.
- BPF_F_ALLOW_MULTI: If a sub-cgroup installs some bpf program,
  that cgroup program gets run in addition to the program in this cgroup.

NONE and BPF_F_ALLOW_OVERRIDE existed before. This patch doesn't
change their behavior. It only clarifies the semantics in relation
to new flag.

Only one program is allowed to be attached to a cgroup with
NONE or BPF_F_ALLOW_OVERRIDE flag.
Multiple programs are allowed to be attached to a cgroup with
BPF_F_ALLOW_MULTI flag. They are executed in FIFO order
(those that were attached first, run first)
The programs of sub-cgroup are executed first, then programs of
this cgroup and then programs of parent cgroup.
All eligible programs are executed regardless of return code from
earlier programs.

To allow efficient execution of multiple programs attached to a cgroup
and to avoid penalizing cgroups without any programs attached
introduce 'struct bpf_prog_array' which is RCU protected array
of pointers to bpf programs.

Signed-off-by: Alexei Starovoitov <ast@kernel.org>
Acked-by: Daniel Borkmann <daniel@iogearbox.net>
Acked-by: Martin KaFai Lau <kafai@fb.com>
for cgroup bits
Acked-by: Tejun Heo <tj@kernel.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
2017-10-04 16:05:05 -07:00
Marcelo Ricardo Leitner
ac1ed8b82c sctp: introduce round robin stream scheduler
This patch introduces RFC Draft ndata section 3.2 Priority Based
Scheduler (SCTP_SS_RR).

Works by maintaining a list of enqueued streams and tracking the last
one used to send data. When the datamsg is done, it switches to the next
stream.

See-also: https://tools.ietf.org/html/draft-ietf-tsvwg-sctp-ndata-13
Tested-by: Xin Long <lucien.xin@gmail.com>
Signed-off-by: Marcelo Ricardo Leitner <marcelo.leitner@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2017-10-03 16:27:29 -07:00
Marcelo Ricardo Leitner
637784ade2 sctp: introduce priority based stream scheduler
This patch introduces RFC Draft ndata section 3.4 Priority Based
Scheduler (SCTP_SS_PRIO).

It works by having a struct sctp_stream_priority for each priority
configured. This struct is then enlisted on a queue ordered per priority
if, and only if, there is a stream with data queued, so that dequeueing
is very straightforward: either finish current datamsg or simply dequeue
from the highest priority queued, which is the next stream pointed, and
that's it.

If there are multiple streams assigned with the same priority and with
data queued, it will do round robin amongst them while respecting
datamsgs boundaries (when not using idata chunks), to be reasonably
fair.

We intentionally don't maintain a list of priorities nor a list of all
streams with the same priority to save memory. The first would mean at
least 2 other pointers per priority (which, for 1000 priorities, that
can mean 16kB) and the second would also mean 2 other pointers but per
stream. As SCTP supports up to 65535 streams on a given asoc, that's
1MB. This impacts when giving a priority to some stream, as we have to
find out if the new priority is already being used and if we can free
the old one, and also when tearing down.

The new fields in struct sctp_stream_out_ext and sctp_stream are added
under a union because that memory is to be shared with other schedulers.

See-also: https://tools.ietf.org/html/draft-ietf-tsvwg-sctp-ndata-13
Tested-by: Xin Long <lucien.xin@gmail.com>
Signed-off-by: Marcelo Ricardo Leitner <marcelo.leitner@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2017-10-03 16:27:29 -07:00
Marcelo Ricardo Leitner
0ccdf3c7fd sctp: add sockopt to get/set stream scheduler parameters
As defined per RFC Draft ndata Section 4.3.3, named as
SCTP_STREAM_SCHEDULER_VALUE.

See-also: https://tools.ietf.org/html/draft-ietf-tsvwg-sctp-ndata-13
Tested-by: Xin Long <lucien.xin@gmail.com>
Signed-off-by: Marcelo Ricardo Leitner <marcelo.leitner@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2017-10-03 16:27:29 -07:00
Marcelo Ricardo Leitner
13aa8770fe sctp: add sockopt to get/set stream scheduler
As defined per RFC Draft ndata Section 4.3.2, named as
SCTP_STREAM_SCHEDULER.

See-also: https://tools.ietf.org/html/draft-ietf-tsvwg-sctp-ndata-13
Tested-by: Xin Long <lucien.xin@gmail.com>
Signed-off-by: Marcelo Ricardo Leitner <marcelo.leitner@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2017-10-03 16:27:29 -07:00
Marcelo Ricardo Leitner
5bbbbe32a4 sctp: introduce stream scheduler foundations
This patch introduces the hooks necessary to do stream scheduling, as
per RFC Draft ndata.  It also introduces the first scheduler, which is
what we do today but now factored out: first come first served (FCFS).

With stream scheduling now we have to track which chunk was enqueued on
which stream and be able to select another other than the in front of
the main outqueue. So we introduce a list on sctp_stream_out_ext
structure for this purpose.

We reuse sctp_chunk->transmitted_list space for the list above, as the
chunk cannot belong to the two lists at the same time. By using the
union in there, we can have distinct names for these moments.

sctp_sched_ops are the operations expected to be implemented by each
scheduler. The dequeueing is a bit particular to this implementation but
it is to match how we dequeue packets today. We first dequeue and then
check if it fits the packet and if not, we requeue it at head. Thus why
we don't have a peek operation but have dequeue_done instead, which is
called once the chunk can be safely considered as transmitted.

The check removed from sctp_outq_flush is now performed by
sctp_stream_outq_migrate, which is only called during assoc setup.
(sctp_sendmsg() also checks for it)

The only operation that is foreseen but not yet added here is a way to
signalize that a new packet is starting or that the packet is done, for
round robin scheduler per packet, but is intentionally left to the
patch that actually implements it.

Support for I-DATA chunks, also described in this RFC, with user message
interleaving is straightforward as it just requires the schedulers to
probe for the feature and ignore datamsg boundaries when dequeueing.

See-also: https://tools.ietf.org/html/draft-ietf-tsvwg-sctp-ndata-13
Tested-by: Xin Long <lucien.xin@gmail.com>
Signed-off-by: Marcelo Ricardo Leitner <marcelo.leitner@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2017-10-03 16:27:29 -07:00
Alexei Starovoitov
90caccdd8c bpf: fix bpf_tail_call() x64 JIT
- bpf prog_array just like all other types of bpf array accepts 32-bit index.
  Clarify that in the comment.
- fix x64 JIT of bpf_tail_call which was incorrectly loading 8 instead of 4 bytes
- tighten corresponding check in the interpreter to stay consistent

The JIT bug can be triggered after introduction of BPF_F_NUMA_NODE flag
in commit 96eabe7a40 in 4.14. Before that the map_flags would stay zero and
though JIT code is wrong it will check bounds correctly.
Hence two fixes tags. All other JITs don't have this problem.

Signed-off-by: Alexei Starovoitov <ast@kernel.org>
Fixes: 96eabe7a40 ("bpf: Allow selecting numa node during map creation")
Fixes: b52f00e6a7 ("x86: bpf_jit: implement bpf_tail_call() helper")
Acked-by: Daniel Borkmann <daniel@iogearbox.net>
Acked-by: Martin KaFai Lau <kafai@fb.com>
Reviewed-by: Eric Dumazet <edumazet@google.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2017-10-03 16:04:44 -07:00
Linus Torvalds
887c8ba753 USB fixes for 4.14-rc4
Here are a number of USB fixes for 4.14-rc4 to resolved reported issue.
 
 There's a bunch of stuff in here based on the great work Andrey
 Konovalov is doing in fuzzing the USB stack.  Lots of bug fixes when
 dealing with corrupted USB descriptors that we've never seen in "normal"
 operation, but is now ensuring the stack is much more hardened overall.
 
 There's also the usual XHCI and gadget driver fixes as well, and a build
 error fix, and a few other minor things, full details in the shortlog.
 
 All of these have been in linux-next with no reported issues.
 
 Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
 -----BEGIN PGP SIGNATURE-----
 
 iG0EABECAC0WIQT0tgzFv3jCIUoxPcsxR9QN2y37KQUCWdN/yw8cZ3JlZ0Brcm9h
 aC5jb20ACgkQMUfUDdst+yl6pQCdGY+nPJhzj9EIeFj5QUpSuS4b1pYAoKrbNn+V
 CMpg4iG1oXUtVL8jBbKa
 =fVpl
 -----END PGP SIGNATURE-----

Merge tag 'usb-4.14-rc4' of git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/usb

Pull USB fixes from Greg KH:
 "Here are a number of USB fixes for 4.14-rc4 to resolved reported
  issues.

  There's a bunch of stuff in here based on the great work Andrey
  Konovalov is doing in fuzzing the USB stack. Lots of bug fixes when
  dealing with corrupted USB descriptors that we've never seen in
  "normal" operation, but is now ensuring the stack is much more
  hardened overall.

  There's also the usual XHCI and gadget driver fixes as well, and a
  build error fix, and a few other minor things, full details in the
  shortlog.

  All of these have been in linux-next with no reported issues"

* tag 'usb-4.14-rc4' of git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/usb: (38 commits)
  usb: dwc3: of-simple: Add compatible for Spreadtrum SC9860 platform
  usb: gadget: udc: atmel: set vbus irqflags explicitly
  usb: gadget: ffs: handle I/O completion in-order
  usb: renesas_usbhs: fix usbhsf_fifo_clear() for RX direction
  usb: renesas_usbhs: fix the BCLR setting condition for non-DCP pipe
  usb: gadget: udc: renesas_usb3: Fix return value of usb3_write_pipe()
  usb: gadget: udc: renesas_usb3: fix Pn_RAMMAP.Pn_MPKT value
  usb: gadget: udc: renesas_usb3: fix for no-data control transfer
  USB: dummy-hcd: Fix erroneous synchronization change
  USB: dummy-hcd: fix infinite-loop resubmission bug
  USB: dummy-hcd: fix connection failures (wrong speed)
  USB: cdc-wdm: ignore -EPIPE from GetEncapsulatedResponse
  USB: devio: Don't corrupt user memory
  USB: devio: Prevent integer overflow in proc_do_submiturb()
  USB: g_mass_storage: Fix deadlock when driver is unbound
  USB: gadgetfs: Fix crash caused by inadequate synchronization
  USB: gadgetfs: fix copy_to_user while holding spinlock
  USB: uas: fix bug in handling of alternate settings
  usb-storage: unusual_devs entry to fix write-access regression for Seagate external drives
  usb-storage: fix bogus hardware error messages for ATA pass-thru devices
  ...
2017-10-03 09:25:40 -07:00
Avraham Stern
503c1fb98b cfg80211/nl80211: add a port authorized event
Add an event that indicates that a connection is authorized
(i.e. the 4 way handshake was performed by the driver). This event
should be sent by the driver after sending a connect/roamed event.

This is useful for networks that require 802.1X authentication.
In cases that the driver supports 4 way handshake offload, but the
802.1X authentication is managed by user space, the driver needs to
inform user space right after the 802.11 association was completed
so user space can initialize its 802.1X state machine etc.
However, it is also possible that the AP will choose to skip the
802.1X authentication (e.g. when PMKSA caching is used) and proceed
with the 4 way handshake immediately. In this case the driver needs
to inform user space that 802.1X authentication is no longer required
(e.g. to prevent user space from disconnecting since it did not get
any EAPOLs from the AP).

This is also useful for roaming, in which case it is possible that
the driver used the Fast Transition protocol so 802.1X is not
required.

Since there will now be a dedicated notification indicating that the
connection is authorized, the authorized flag can be removed from the
roamed event. Drivers can send the new port authorized event right
after sending the roamed event to indicate the new AP is already
authorized. This therefore reserves the old PORT_AUTHORIZED attribute.

Signed-off-by: Avraham Stern <avraham.stern@intel.com>
Signed-off-by: Johannes Berg <johannes.berg@intel.com>
2017-10-02 14:08:27 +02:00
Maciej Żenczykowski
84e14fe353 net-ipv6: add support for sockopt(SOL_IPV6, IPV6_FREEBIND)
So far we've been relying on sockopt(SOL_IP, IP_FREEBIND) being usable
even on IPv6 sockets.

However, it turns out it is perfectly reasonable to want to set freebind
on an AF_INET6 SOCK_RAW socket - but there is no way to set any SOL_IP
socket option on such a socket (they're all blindly errored out).

One use case for this is to allow spoofing src ip on a raw socket
via sendmsg cmsg.

Tested:
  built, and booted
  # python
  >>> import socket
  >>> SOL_IP = socket.SOL_IP
  >>> SOL_IPV6 = socket.IPPROTO_IPV6
  >>> IP_FREEBIND = 15
  >>> IPV6_FREEBIND = 78
  >>> s = socket.socket(socket.AF_INET6, socket.SOCK_DGRAM, 0)
  >>> s.getsockopt(SOL_IP, IP_FREEBIND)
  0
  >>> s.getsockopt(SOL_IPV6, IPV6_FREEBIND)
  0
  >>> s.setsockopt(SOL_IPV6, IPV6_FREEBIND, 1)
  >>> s.getsockopt(SOL_IP, IP_FREEBIND)
  1
  >>> s.getsockopt(SOL_IPV6, IPV6_FREEBIND)
  1

Signed-off-by: Maciej Żenczykowski <maze@google.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2017-09-30 05:30:52 +01:00
Martin KaFai Lau
ad5b177bd7 bpf: Add map_name to bpf_map_info
This patch allows userspace to specify a name for a map
during BPF_MAP_CREATE.

The map's name can later be exported to user space
via BPF_OBJ_GET_INFO_BY_FD.

Signed-off-by: Martin KaFai Lau <kafai@fb.com>
Acked-by: Alexei Starovoitov <ast@fb.com>
Acked-by: Daniel Borkmann <daniel@iogearbox.net>
Signed-off-by: David S. Miller <davem@davemloft.net>
2017-09-29 06:17:05 +01:00
Martin KaFai Lau
cb4d2b3f03 bpf: Add name, load_time, uid and map_ids to bpf_prog_info
The patch adds name and load_time to struct bpf_prog_aux.  They
are also exported to bpf_prog_info.

The bpf_prog's name is passed by userspace during BPF_PROG_LOAD.
The kernel only stores the first (BPF_PROG_NAME_LEN - 1) bytes
and the name stored in the kernel is always \0 terminated.

The kernel will reject name that contains characters other than
isalnum() and '_'.  It will also reject name that is not null
terminated.

The existing 'user->uid' of the bpf_prog_aux is also exported to
the bpf_prog_info as created_by_uid.

The existing 'used_maps' of the bpf_prog_aux is exported to
the newly added members 'nr_map_ids' and 'map_ids' of
the bpf_prog_info.  On the input, nr_map_ids tells how
big the userspace's map_ids buffer is.  On the output,
nr_map_ids tells the exact user_map_cnt and it will only
copy up to the userspace's map_ids buffer is allowed.

Signed-off-by: Martin KaFai Lau <kafai@fb.com>
Acked-by: Alexei Starovoitov <ast@fb.com>
Acked-by: Daniel Borkmann <daniel@iogearbox.net>
Signed-off-by: David S. Miller <davem@davemloft.net>
2017-09-29 06:17:05 +01:00
Nikolay Aleksandrov
5af48b59f3 net: bridge: add per-port group_fwd_mask with less restrictions
We need to be able to transparently forward most link-local frames via
tunnels (e.g. vxlan, qinq). Currently the bridge's group_fwd_mask has a
mask which restricts the forwarding of STP and LACP, but we need to be able
to forward these over tunnels and control that forwarding on a per-port
basis thus add a new per-port group_fwd_mask option which only disallows
mac pause frames to be forwarded (they're always dropped anyway).
The patch does not change the current default situation - all of the others
are still restricted unless configured for forwarding.
We have successfully tested this patch with LACP and STP forwarding over
VxLAN and qinq tunnels.

Signed-off-by: Nikolay Aleksandrov <nikolay@cumulusnetworks.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2017-09-29 06:02:55 +01:00
Daniel Borkmann
de8f3a83b0 bpf: add meta pointer for direct access
This work enables generic transfer of metadata from XDP into skb. The
basic idea is that we can make use of the fact that the resulting skb
must be linear and already comes with a larger headroom for supporting
bpf_xdp_adjust_head(), which mangles xdp->data. Here, we base our work
on a similar principle and introduce a small helper bpf_xdp_adjust_meta()
for adjusting a new pointer called xdp->data_meta. Thus, the packet has
a flexible and programmable room for meta data, followed by the actual
packet data. struct xdp_buff is therefore laid out that we first point
to data_hard_start, then data_meta directly prepended to data followed
by data_end marking the end of packet. bpf_xdp_adjust_head() takes into
account whether we have meta data already prepended and if so, memmove()s
this along with the given offset provided there's enough room.

xdp->data_meta is optional and programs are not required to use it. The
rationale is that when we process the packet in XDP (e.g. as DoS filter),
we can push further meta data along with it for the XDP_PASS case, and
give the guarantee that a clsact ingress BPF program on the same device
can pick this up for further post-processing. Since we work with skb
there, we can also set skb->mark, skb->priority or other skb meta data
out of BPF, thus having this scratch space generic and programmable
allows for more flexibility than defining a direct 1:1 transfer of
potentially new XDP members into skb (it's also more efficient as we
don't need to initialize/handle each of such new members). The facility
also works together with GRO aggregation. The scratch space at the head
of the packet can be multiple of 4 byte up to 32 byte large. Drivers not
yet supporting xdp->data_meta can simply be set up with xdp->data_meta
as xdp->data + 1 as bpf_xdp_adjust_meta() will detect this and bail out,
such that the subsequent match against xdp->data for later access is
guaranteed to fail.

The verifier treats xdp->data_meta/xdp->data the same way as we treat
xdp->data/xdp->data_end pointer comparisons. The requirement for doing
the compare against xdp->data is that it hasn't been modified from it's
original address we got from ctx access. It may have a range marking
already from prior successful xdp->data/xdp->data_end pointer comparisons
though.

Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
Acked-by: Alexei Starovoitov <ast@kernel.org>
Acked-by: John Fastabend <john.fastabend@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2017-09-26 13:36:44 -07:00
Petar Penkov
90e33d4594 tun: enable napi_gro_frags() for TUN/TAP driver
Add a TUN/TAP receive mode that exercises the napi_gro_frags()
interface. This mode is available only in TAP mode, as the interface
expects packets with Ethernet headers.

Furthermore, packets follow the layout of the iovec_iter that was
received. The first iovec is the linear data, and every one after the
first is a fragment. If there are more fragments than the max number,
drop the packet. Additionally, invoke eth_get_headlen() to exercise flow
dissector code and to verify that the header resides in the linear data.

The napi_gro_frags() mode requires setting the IFF_NAPI_FRAGS option.
This is imposed because this mode is intended for testing via tools like
syzkaller and packetdrill, and the increased flexibility it provides can
introduce security vulnerabilities. This flag is accepted only if the
device is in TAP mode and has the IFF_NAPI flag set as well. This is
done because both of these are explicit requirements for correct
operation in this mode.

Signed-off-by: Petar Penkov <peterpenkov96@gmail.com>
Cc: Eric Dumazet <edumazet@google.com>
Cc: Mahesh Bandewar <maheshb@google.com>
Cc: Willem de Bruijn <willemb@google.com>
Cc: davem@davemloft.net
Cc: ppenkov@stanford.edu
Acked-by: Mahesh Bandewar <maheshb@google,com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2017-09-25 20:16:13 -07:00
Petar Penkov
943170998b tun: enable NAPI for TUN/TAP driver
Changes TUN driver to use napi_gro_receive() upon receiving packets
rather than netif_rx_ni(). Adds flag IFF_NAPI that enables these
changes and operation is not affected if the flag is disabled.  SKBs
are constructed upon packet arrival and are queued to be processed
later.

The new path was evaluated with a benchmark with the following setup:
Open two tap devices and a receiver thread that reads in a loop for
each device. Start one sender thread and pin all threads to different
CPUs. Send 1M minimum UDP packets to each device and measure sending
time for each of the sending methods:
	napi_gro_receive():	4.90s
	netif_rx_ni():		4.90s
	netif_receive_skb():	7.20s

Signed-off-by: Petar Penkov <peterpenkov96@gmail.com>
Cc: Eric Dumazet <edumazet@google.com>
Cc: Mahesh Bandewar <maheshb@google.com>
Cc: Willem de Bruijn <willemb@google.com>
Cc: davem@davemloft.net
Cc: ppenkov@stanford.edu
Acked-by: Mahesh Bandewar <maheshb@google.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2017-09-25 20:16:13 -07:00
Leon Romanovsky
78b1beb099 IB/core: Fix typo in the name of the tag-matching cap struct
The tag matching functionality is implemented by mlx5 driver
by extending XRQ, however this internal kernel information was
exposed to user space applications with *xrq* name instead of *tm*.

This patch renames *xrq* to *tm* to handle that.

Fixes: 8d50505ada ("IB/uverbs: Expose XRQ capabilities")
Signed-off-by: Leon Romanovsky <leonro@mellanox.com>
Reviewed-by: Yishai Hadas <yishaih@mellanox.com>
Signed-off-by: Doug Ledford <dledford@redhat.com>
2017-09-25 11:47:23 -04:00
Mikulas Patocka
62e082430e dm ioctl: fix alignment of event number in the device list
The size of struct dm_name_list is different on 32-bit and 64-bit
kernels (so "(nl + 1)" differs between 32-bit and 64-bit kernels).

This mismatch caused some harmless difference in padding when using 32-bit
or 64-bit kernel. Commit 23d70c5e52 ("dm ioctl: report event number in
DM_LIST_DEVICES") added reporting event number in the output of
DM_LIST_DEVICES_CMD. This difference in padding makes it impossible for
userspace to determine the location of the event number (the location
would be different when running on 32-bit and 64-bit kernels).

Fix the padding by using offsetof(struct dm_name_list, name) instead of
sizeof(struct dm_name_list) to determine the location of entries.

Also, the ioctl version number is incremented to 37 so that userspace
can use the version number to determine that the event number is present
and correctly located.

In addition, a global event is now raised when a DM device is created,
removed, renamed or when table is swapped, so that the user can monitor
for device changes.

Reported-by: Eugene Syromiatnikov <esyr@redhat.com>
Fixes: 23d70c5e52 ("dm ioctl: report event number in DM_LIST_DEVICES")
Cc: stable@vger.kernel.org # 4.13
Signed-off-by: Mikulas Patocka <mpatocka@redhat.com>
Signed-off-by: Mike Snitzer <snitzer@redhat.com>
2017-09-25 11:18:29 -04:00
Linus Torvalds
71aa60f67f Merge git://git.kernel.org/pub/scm/linux/kernel/git/davem/net
Pull networking fixes from David Miller:

 1) Fix NAPI poll list corruption in enic driver, from Christian
    Lamparter.

 2) Fix route use after free, from Eric Dumazet.

 3) Fix regression in reuseaddr handling, from Josef Bacik.

 4) Assert the size of control messages in compat handling since we copy
    it in from userspace twice. From Meng Xu.

 5) SMC layer bug fixes (missing RCU locking, bad refcounting, etc.)
    from Ursula Braun.

 6) Fix races in AF_PACKET fanout handling, from Willem de Bruijn.

 7) Don't use ARRAY_SIZE on spinlock array which might have zero
    entries, from Geert Uytterhoeven.

 8) Fix miscomputation of checksum in ipv6 udp code, from Subash Abhinov
    Kasiviswanathan.

 9) Push the ipv6 header properly in ipv6 GRE tunnel driver, from Xin
    Long.

* git://git.kernel.org/pub/scm/linux/kernel/git/davem/net: (75 commits)
  inet: fix improper empty comparison
  net: use inet6_rcv_saddr to compare sockets
  net: set tb->fast_sk_family
  net: orphan frags on stand-alone ptype in dev_queue_xmit_nit
  MAINTAINERS: update git tree locations for ieee802154 subsystem
  net: prevent dst uses after free
  net: phy: Fix truncation of large IRQ numbers in phy_attached_print()
  net/smc: no close wait in case of process shut down
  net/smc: introduce a delay
  net/smc: terminate link group if out-of-sync is received
  net/smc: longer delay for client link group removal
  net/smc: adapt send request completion notification
  net/smc: adjust net_device refcount
  net/smc: take RCU read lock for routing cache lookup
  net/smc: add receive timeout check
  net/smc: add missing dev_put
  net: stmmac: Cocci spatch "of_table"
  lan78xx: Use default values loaded from EEPROM/OTP after reset
  lan78xx: Allow EEPROM write for less than MAX_EEPROM_SIZE
  lan78xx: Fix for eeprom read/write when device auto suspend
  ...
2017-09-23 05:41:27 -10:00
Linus Torvalds
c0a3a64e72 Major additions:
- sysctl and seccomp operation to discover available actions. (tyhicks)
 - new per-filter configurable logging infrastructure and sysctl. (tyhicks)
 - SECCOMP_RET_LOG to log allowed syscalls. (tyhicks)
 - SECCOMP_RET_KILL_PROCESS as the new strictest possible action.
 - self-tests for new behaviors.
 -----BEGIN PGP SIGNATURE-----
 Version: GnuPG v1
 Comment: Kees Cook <kees@outflux.net>
 
 iQIcBAABCgAGBQJZxVbTAAoJEIly9N/cbcAmvIAQALR9aVQQXjma4lLhZxwTsLtG
 rJm8t/o4y/2aBV8vzpFbMPT5gfN/PAkHJpCoxVPssx0k4PH2M7HjpnR6E1OC+erg
 RNom3uNdNqZeFlDpdX1qriYiCTB9p6rHe0DPwgG9iGqgDxsJ+G3W+x1sMZ1C+A0M
 shxA3fwt+Qpivo8Zq44xjMFjK+Zeor9V3yPc51QoZktWHlM16ID3HvHVnUtzqAUb
 nTWF6ZlmZlJ/lp4Dq8/55lytVcXPo240G3H0Odai+SNFakK6p5UO//BRBV209bmb
 05jpAOH6uym1sxVz00TQXCtDqOEzs2mQgomtTSShHg8SrLFX7nFkEFtAVA6tEri2
 FqDYce9KX7ZtOYiq83C7pnpAFCouc0z31dQl9USHiAiexXklwBIX+OsVv98omWGi
 pW43uLE2ovY0cpOsN50xI4mnxiGh6MhFcdbor2VLRJwLIFSw3XjjgNCCLyK4AJxs
 N514252qi70c9cWyAHYDLy077yTVxu3JUlsVQKtRTMfoFUq6bX1jPXVXE8qkVrui
 bc/Ay54pPrUwM854IpQ9ZBOuMfs6I5opocGIsBvMaND45U4o2B0ANCsxhuZ0zEtM
 E55DhK5OgjukNemQmlWK2foDckYdtkJXCj2yMBNQady0Uynr2BWZ6VDBP7vFcnRB
 UihRlFZRZleu8383uHsc
 =sKeC
 -----END PGP SIGNATURE-----

Merge tag 'seccomp-v4.14-rc2' of git://git.kernel.org/pub/scm/linux/kernel/git/kees/linux

Pull seccomp updates from Kees Cook:
 "Major additions:

   - sysctl and seccomp operation to discover available actions
     (tyhicks)

   - new per-filter configurable logging infrastructure and sysctl
     (tyhicks)

   - SECCOMP_RET_LOG to log allowed syscalls (tyhicks)

   - SECCOMP_RET_KILL_PROCESS as the new strictest possible action

   - self-tests for new behaviors"

[ This is the seccomp part of the security pull request during the merge
  window that was nixed due to unrelated problems   - Linus ]

* tag 'seccomp-v4.14-rc2' of git://git.kernel.org/pub/scm/linux/kernel/git/kees/linux:
  samples: Unrename SECCOMP_RET_KILL
  selftests/seccomp: Test thread vs process killing
  seccomp: Implement SECCOMP_RET_KILL_PROCESS action
  seccomp: Introduce SECCOMP_RET_KILL_PROCESS
  seccomp: Rename SECCOMP_RET_KILL to SECCOMP_RET_KILL_THREAD
  seccomp: Action to log before allowing
  seccomp: Filter flag to log all actions except SECCOMP_RET_ALLOW
  seccomp: Selftest for detection of filter flag support
  seccomp: Sysctl to configure actions that are allowed to be logged
  seccomp: Operation for checking if an action is available
  seccomp: Sysctl to display available actions
  seccomp: Provide matching filter for introspection
  selftests/seccomp: Refactor RET_ERRNO tests
  selftests/seccomp: Add simple seccomp overhead benchmark
  selftests/seccomp: Add tests for basic ptrace actions
2017-09-22 16:16:41 -10:00
Florian Fainelli
19cab88726 net: ethtool: Add back transceiver type
Commit 3f1ac7a700 ("net: ethtool: add new ETHTOOL_xLINKSETTINGS API")
deprecated the ethtool_cmd::transceiver field, which was fine in
premise, except that the PHY library was actually using it to report the
type of transceiver: internal or external.

Use the first word of the reserved field to put this __u8 transceiver
field back in. It is made read-only, and we don't expect the
ETHTOOL_xLINKSETTINGS API to be doing anything with this anyway, so this
is mostly for the legacy path where we do:

ethtool_get_settings()
-> dev->ethtool_ops->get_link_ksettings()
   -> convert_link_ksettings_to_legacy_settings()

to have no information loss compared to the legacy get_settings API.

Fixes: 3f1ac7a700 ("net: ethtool: add new ETHTOOL_xLINKSETTINGS API")
Signed-off-by: Florian Fainelli <f.fainelli@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2017-09-21 15:20:40 -07:00
Emmanuel Grumbach
65026002d6 nl80211: add an option to allow MFP without requiring it
The user space can now allow the kernel to associate to an AP that
requires MFP or that doesn't have MFP enabled in the same
NL80211_CMD_CONNECT command, by using a new NL80211_MFP_OPTIONAL flag.
The driver / firmware will decide whether to use it or not.

Include a feature bit to advertise support for NL80211_MFP_OPTIONAL.
This allows new user space to run on old kernels and know that it
cannot use the new attribute if it isn't supported.

Signed-off-by: Emmanuel Grumbach <emmanuel.grumbach@intel.com>
Signed-off-by: Luca Coelho <luciano.coelho@intel.com>
Signed-off-by: Johannes Berg <johannes.berg@intel.com>
2017-09-21 11:42:03 +02:00
Roee Zamir
2d23d0736e nl80211: add OCE scan and capability flags
Add Optimized Connectivity Experience (OCE) scan and capability flags.
Some of them unique to OCE and some are stand alone.
And add scan flags to enable/disable them.

Signed-off-by: Roee Zamir <roee.zamir@intel.com>
Signed-off-by: Luca Coelho <luciano.coelho@intel.com>
Signed-off-by: Johannes Berg <johannes.berg@intel.com>
2017-09-21 11:41:59 +02:00
Greg Kroah-Hartman
bd7a3fe770 USB: fix out-of-bounds in usb_set_configuration
Andrey Konovalov reported a possible out-of-bounds problem for a USB interface
association descriptor.  He writes:
	It seems there's no proper size check of a USB_DT_INTERFACE_ASSOCIATION
	descriptor. It's only checked that the size is >= 2 in
	usb_parse_configuration(), so find_iad() might do out-of-bounds access
	to intf_assoc->bInterfaceCount.

And he's right, we don't check for crazy descriptors of this type very well, so
resolve this problem.  Yet another issue found by syzkaller...

Reported-by: Andrey Konovalov <andreyknvl@google.com>
Tested-by: Andrey Konovalov <andreyknvl@google.com>
Cc: stable <stable@vger.kernel.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2017-09-19 17:27:16 +02:00
Dave Airlie
134dd2e616 Merge tag 'drm-amdkfd-next-2017-09-02' of git://people.freedesktop.org/~gabbayo/linux into drm-fixes
some trivial amdkfd cleanups

* tag 'drm-amdkfd-next-2017-09-02' of git://people.freedesktop.org/~gabbayo/linux:
  drm/amdkfd: pass queue's mqd when destroying mqd
  drm/amdkfd: remove memset before memcpy
  uapi linux/kfd_ioctl.h: only use __u32 and __u64
2017-09-18 16:29:47 +10:00
Linus Torvalds
e7cdb60fd2 Merge branch 'zstd-minimal' of git://git.kernel.org/pub/scm/linux/kernel/git/mason/linux-btrfs
Pull zstd support from Chris Mason:
 "Nick Terrell's patch series to add zstd support to the kernel has been
  floating around for a while. After talking with Dave Sterba, Herbert
  and Phillip, we decided to send the whole thing in as one pull
  request.

  zstd is a big win in speed over zlib and in compression ratio over
  lzo, and the compression team here at FB has gotten great results
  using it in production. Nick will continue to update the kernel side
  with new improvements from the open source zstd userland code.

  Nick has a number of benchmarks for the main zstd code in his lib/zstd
  commit:

      I ran the benchmarks on a Ubuntu 14.04 VM with 2 cores and 4 GiB
      of RAM. The VM is running on a MacBook Pro with a 3.1 GHz Intel
      Core i7 processor, 16 GB of RAM, and a SSD. I benchmarked using
      `silesia.tar` [3], which is 211,988,480 B large. Run the following
      commands for the benchmark:

        sudo modprobe zstd_compress_test
        sudo mknod zstd_compress_test c 245 0
        sudo cp silesia.tar zstd_compress_test

      The time is reported by the time of the userland `cp`.
      The MB/s is computed with

        1,536,217,008 B / time(buffer size, hash)

      which includes the time to copy from userland.
      The Adjusted MB/s is computed with

        1,536,217,088 B / (time(buffer size, hash) - time(buffer size, none)).

      The memory reported is the amount of memory the compressor
      requests.

        | Method   | Size (B) | Time (s) | Ratio | MB/s    | Adj MB/s | Mem (MB) |
        |----------|----------|----------|-------|---------|----------|----------|
        | none     | 11988480 |    0.100 |     1 | 2119.88 |        - |        - |
        | zstd -1  | 73645762 |    1.044 | 2.878 |  203.05 |   224.56 |     1.23 |
        | zstd -3  | 66988878 |    1.761 | 3.165 |  120.38 |   127.63 |     2.47 |
        | zstd -5  | 65001259 |    2.563 | 3.261 |   82.71 |    86.07 |     2.86 |
        | zstd -10 | 60165346 |   13.242 | 3.523 |   16.01 |    16.13 |    13.22 |
        | zstd -15 | 58009756 |   47.601 | 3.654 |    4.45 |     4.46 |    21.61 |
        | zstd -19 | 54014593 |  102.835 | 3.925 |    2.06 |     2.06 |    60.15 |
        | zlib -1  | 77260026 |    2.895 | 2.744 |   73.23 |    75.85 |     0.27 |
        | zlib -3  | 72972206 |    4.116 | 2.905 |   51.50 |    52.79 |     0.27 |
        | zlib -6  | 68190360 |    9.633 | 3.109 |   22.01 |    22.24 |     0.27 |
        | zlib -9  | 67613382 |   22.554 | 3.135 |    9.40 |     9.44 |     0.27 |

      I benchmarked zstd decompression using the same method on the same
      machine. The benchmark file is located in the upstream zstd repo
      under `contrib/linux-kernel/zstd_decompress_test.c` [4]. The
      memory reported is the amount of memory required to decompress
      data compressed with the given compression level. If you know the
      maximum size of your input, you can reduce the memory usage of
      decompression irrespective of the compression level.

        | Method   | Time (s) | MB/s    | Adjusted MB/s | Memory (MB) |
        |----------|----------|---------|---------------|-------------|
        | none     |    0.025 | 8479.54 |             - |           - |
        | zstd -1  |    0.358 |  592.15 |        636.60 |        0.84 |
        | zstd -3  |    0.396 |  535.32 |        571.40 |        1.46 |
        | zstd -5  |    0.396 |  535.32 |        571.40 |        1.46 |
        | zstd -10 |    0.374 |  566.81 |        607.42 |        2.51 |
        | zstd -15 |    0.379 |  559.34 |        598.84 |        4.61 |
        | zstd -19 |    0.412 |  514.54 |        547.77 |        8.80 |
        | zlib -1  |    0.940 |  225.52 |        231.68 |        0.04 |
        | zlib -3  |    0.883 |  240.08 |        247.07 |        0.04 |
        | zlib -6  |    0.844 |  251.17 |        258.84 |        0.04 |
        | zlib -9  |    0.837 |  253.27 |        287.64 |        0.04 |

  I ran a long series of tests and benchmarks on the btrfs side and the
  gains are very similar to the core benchmarks Nick ran"

* 'zstd-minimal' of git://git.kernel.org/pub/scm/linux/kernel/git/mason/linux-btrfs:
  squashfs: Add zstd support
  btrfs: Add zstd support
  lib: Add zstd modules
  lib: Add xxhash module
2017-09-14 17:30:49 -07:00
Linus Torvalds
dd198ce714 Merge branch 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/ebiederm/user-namespace
Pull namespace updates from Eric Biederman:
 "Life has been busy and I have not gotten half as much done this round
  as I would have liked. I delayed it so that a minor conflict
  resolution with the mips tree could spend a little time in linux-next
  before I sent this pull request.

  This includes two long delayed user namespace changes from Kirill
  Tkhai. It also includes a very useful change from Serge Hallyn that
  allows the security capability attribute to be used inside of user
  namespaces. The practical effect of this is people can now untar
  tarballs and install rpms in user namespaces. It had been suggested to
  generalize this and encode some of the namespace information
  information in the xattr name. Upon close inspection that makes the
  things that should be hard easy and the things that should be easy
  more expensive.

  Then there is my bugfix/cleanup for signal injection that removes the
  magic encoding of the siginfo union member from the kernel internal
  si_code. The mips folks reported the case where I had used FPE_FIXME
  me is impossible so I have remove FPE_FIXME from mips, while at the
  same time including a return statement in that case to keep gcc from
  complaining about unitialized variables.

  I almost finished the work to get make copy_siginfo_to_user a trivial
  copy to user. The code is available at:

     git://git.kernel.org/pub/scm/linux/kernel/git/ebiederm/user-namespace.git neuter-copy_siginfo_to_user-v3

  But I did not have time/energy to get the code posted and reviewed
  before the merge window opened.

  I was able to see that the security excuse for just copying fields
  that we know are initialized doesn't work in practice there are buggy
  initializations that don't initialize the proper fields in siginfo. So
  we still sometimes copy unitialized data to userspace"

* 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/ebiederm/user-namespace:
  Introduce v3 namespaced file capabilities
  mips/signal: In force_fcr31_sig return in the impossible case
  signal: Remove kernel interal si_code magic
  fcntl: Don't use ambiguous SIG_POLL si_codes
  prctl: Allow local CAP_SYS_ADMIN changing exe_file
  security: Use user_namespace::level to avoid redundant iterations in cap_capable()
  userns,pidns: Verify the userns for new pid namespaces
  signal/testing: Don't look for __SI_FAULT in userspace
  signal/mips: Document a conflict with SI_USER with SIGFPE
  signal/sparc: Document a conflict with SI_USER with SIGFPE
  signal/ia64: Document a conflict with SI_USER with SIGFPE
  signal/alpha: Document a conflict with SI_USER for SIGTRAP
2017-09-11 18:34:47 -07:00
Linus Torvalds
ae46654bcf ARM: SoC driver updates for v4.14
This branch contains platform-related driver updates for ARM and ARM64.
 
 Among them:
 
  - Reset driver updates:
   + New API for dealing with arrays of resets
   + Make unimplemented {de,}assert return success on shared resets
   + MSDKv1 driver
   + Removal of obsolete Gemini reset driver
   + Misc updates for sunxi and Uniphier
 
  - SoC drivers:
   + Platform SoC driver registration on Tegra
   + Shuffle of Qualcomm drivers into a submenu
   + Allwinner A64 support for SRAM
   + Renesas R-Car R3 support
   + Power domains for Rockchip RK3366
 
  - Misc updates and smaller fixes for TEE and memory driver subsystems
 -----BEGIN PGP SIGNATURE-----
 Version: GnuPG v1
 
 iQIcBAABAgAGBQJZtdt7AAoJEIwa5zzehBx3EboP/jR2T9lrMavXR1zL48L14yJb
 S+fiJlrX1Kr42UF4PQvsfs+uTqOLmycrPVFkMb6IwoUPzQ9UCOSoiMzYm2b7ZPvt
 uIesHhdM3/xun6wKfieN8GmNA1yDVynTxo0TTYDw5ha7I6s2GHgw0GSFzy3wm0Qg
 KzerAO3gzf3L5XsKR0cai3IXNjHO9ubpFG1ReR09da28nPElP8ggWg0KnqdO76Ch
 BGpFj78EC875ZNqwHgnspUqgGDJnBjig3m/uA4FWA0G9Jl38tCyKTZfUR7cEraoV
 kyCgBlR/UrI8eXVTyEy5k5iTsQ3A1VhX4rGjyH+5NZHTs1yWr4+RDND/qeGl9tSo
 VASuOtH6Rc3vdUDpHPBNAFNQH8fwwDoKf96dvN1tiffsx6LSKb//NyOfkXzKOtR6
 CP5raYfX4YktLtHq0XVTZ/6r3XmLcTHzElR/dCFpQOFcTOYii0pWtfcWouahbZ1w
 dhoBX/dbNq37MfzrxtHN2VTIEHpn2GU7u+ZGkp2ArokD58BAft/M3Xee1cDnF75g
 ZDwe5eNFT8aBZKaY7zwG8cdxiw9kACAivDRwW+zgpfUr39c+d0+QmVfnfJh4EGXK
 Ri6yr2EfBWK6jw3cwkdSyyt7iSzIkB+RiuuD1MjpYhWzAvoDpzqkXYukFGbpXnuy
 vUFHNuP1ocUsRtCs8mm1
 =yBzS
 -----END PGP SIGNATURE-----

Merge tag 'armsoc-drivers' of git://git.kernel.org/pub/scm/linux/kernel/git/arm/arm-soc

Pull ARM SoC driver updates from Olof Johansson:
 "This branch contains platform-related driver updates for ARM and ARM64.

  Among them:

   - Reset driver updates:
     + New API for dealing with arrays of resets
     + Make unimplemented {de,}assert return success on shared resets
     + MSDKv1 driver
     + Removal of obsolete Gemini reset driver
     + Misc updates for sunxi and Uniphier

   - SoC drivers:
     + Platform SoC driver registration on Tegra
     + Shuffle of Qualcomm drivers into a submenu
     + Allwinner A64 support for SRAM
     + Renesas R-Car R3 support
     + Power domains for Rockchip RK3366

   - Misc updates and smaller fixes for TEE and memory driver
     subsystems"

* tag 'armsoc-drivers' of git://git.kernel.org/pub/scm/linux/kernel/git/arm/arm-soc: (54 commits)
  firmware: arm_scpi: fix endianness of dev_id in struct dev_pstate_set
  soc/tegra: fuse: Add missing semi-colon
  soc/tegra: Restrict SoC device registration to Tegra
  drivers: soc: sunxi: add support for A64 and its SRAM C
  drivers: soc: sunxi: add support for remapping func value to reg value
  drivers: soc: sunxi: fix error processing on base address when claiming
  dt-bindings: add binding for Allwinner A64 SRAM controller and SRAM C
  bus: sunxi-rsb: Enable by default for ARM64
  soc/tegra: Register SoC device
  firmware: tegra: set drvdata earlier
  memory: Convert to using %pOF instead of full_name
  soc: Convert to using %pOF instead of full_name
  bus: Convert to using %pOF instead of full_name
  firmware: Convert to using %pOF instead of full_name
  soc: mediatek: add SCPSYS power domain driver for MediaTek MT7622 SoC
  soc: mediatek: add header files required for MT7622 SCPSYS dt-binding
  soc: mediatek: reduce code duplication of scpsys_probe across all SoCs
  dt-bindings: soc: update the binding document for SCPSYS on MediaTek MT7622 SoC
  reset: uniphier: add analog amplifiers reset control
  reset: uniphier: add video input subsystem reset control
  ...
2017-09-10 20:40:00 -07:00
Linus Torvalds
126e76ffbf Merge branch 'for-4.14/block-postmerge' of git://git.kernel.dk/linux-block
Pull followup block layer updates from Jens Axboe:
 "I ended up splitting the main pull request for this series into two,
  mainly because of clashes between NVMe fixes that went into 4.13 after
  the for-4.14 branches were split off. This pull request is mostly
  NVMe, but not exclusively. In detail, it contains:

   - Two pull request for NVMe changes from Christoph. Nothing new on
     the feature front, basically just fixes all over the map for the
     core bits, transport, rdma, etc.

   - Series from Bart, cleaning up various bits in the BFQ scheduler.

   - Series of bcache fixes, which has been lingering for a release or
     two. Coly sent this in, but patches from various people in this
     area.

   - Set of patches for BFQ from Paolo himself, updating both
     documentation and fixing some corner cases in performance.

   - Series from Omar, attempting to now get the 4k loop support
     correct. Our confidence level is higher this time.

   - Series from Shaohua for loop as well, improving O_DIRECT
     performance and fixing a use-after-free"

* 'for-4.14/block-postmerge' of git://git.kernel.dk/linux-block: (74 commits)
  bcache: initialize dirty stripes in flash_dev_run()
  loop: set physical block size to logical block size
  bcache: fix bch_hprint crash and improve output
  bcache: Update continue_at() documentation
  bcache: silence static checker warning
  bcache: fix for gc and write-back race
  bcache: increase the number of open buckets
  bcache: Correct return value for sysfs attach errors
  bcache: correct cache_dirty_target in __update_writeback_rate()
  bcache: gc does not work when triggering by manual command
  bcache: Don't reinvent the wheel but use existing llist API
  bcache: do not subtract sectors_to_gc for bypassed IO
  bcache: fix sequential large write IO bypass
  bcache: Fix leak of bdev reference
  block/loop: remove unused field
  block/loop: fix use after free
  bfq: Use icq_to_bic() consistently
  bfq: Suppress compiler warnings about comparisons
  bfq: Check kstrtoul() return value
  bfq: Declare local functions static
  ...
2017-09-09 12:49:01 -07:00
Linus Torvalds
fbd01410e8 Merge git://git.kernel.org/pub/scm/linux/kernel/git/davem/net
Pull networking fixes from David Miller:
 "The iwlwifi firmware compat fix is in here as well as some other
  stuff:

  1) Fix request socket leak introduced by BPF deadlock fix, from Eric
     Dumazet.

  2) Fix VLAN handling with TXQs in mac80211, from Johannes Berg.

  3) Missing __qdisc_drop conversions in prio and qfq schedulers, from
     Gao Feng.

  4) Use after free in netlink nlk groups handling, from Xin Long.

  5) Handle MTU update properly in ipv6 gre tunnels, from Xin Long.

  6) Fix leak of ipv6 fib tables on netns teardown, from Sabrina Dubroca
     with follow-on fix from Eric Dumazet.

  7) Need RCU and preemption disabled during generic XDP data patch,
     from John Fastabend"

* git://git.kernel.org/pub/scm/linux/kernel/git/davem/net: (54 commits)
  bpf: make error reporting in bpf_warn_invalid_xdp_action more clear
  Revert "mdio_bus: Remove unneeded gpiod NULL check"
  bpf: devmap, use cond_resched instead of cpu_relax
  bpf: add support for sockmap detach programs
  net: rcu lock and preempt disable missing around generic xdp
  bpf: don't select potentially stale ri->map from buggy xdp progs
  net: tulip: Constify tulip_tbl
  net: ethernet: ti: netcp_core: no need in netif_napi_del
  davicom: Display proper debug level up to 6
  net: phy: sfp: rename dt properties to match the binding
  dt-binding: net: sfp binding documentation
  dt-bindings: add SFF vendor prefix
  dt-bindings: net: don't confuse with generic PHY property
  ip6_tunnel: fix setting hop_limit value for ipv6 tunnel
  ip_tunnel: fix setting ttl and tos value in collect_md mode
  ipv6: fix typo in fib6_net_exit()
  tcp: fix a request socket leak
  sctp: fix missing wake ups in some situations
  netfilter: xt_hashlimit: fix build error caused by 64bit division
  netfilter: xt_hashlimit: alloc hashtable with right size
  ...
2017-09-09 11:05:20 -07:00