kernel_optimize_test/include
Parav Pandit 01b671170d RDMA/core: Sync unregistration with netlink commands
When the rdma device is getting removed, get resource info can race with
device removal, as below:

      CPU-0                                  CPU-1
    --------                               --------
    rdma_nl_rcv_msg()
       nldev_res_get_cq_dumpit()
          mutex_lock(device_lock);
          get device reference
          mutex_unlock(device_lock);        [..]
                                            ib_unregister_device()
                                            /* Valid reference to
                                             * device->dev exists.
                                             */
                                             ib_dealloc_device()

          [..]
          provider->fill_res_entry();

Even though device object is not freed, fill_res_entry() can get called on
device which doesn't have a driver anymore. Kernel core device reference
count is not sufficient, as this only keeps the structure valid, and
doesn't guarantee the driver is still loaded.

Similar race can occur with device renaming and device removal, where
device_rename() tries to rename a unregistered device. While this is fine
for devices of a class which are not net namespace aware, but it is
incorrect for net namespace aware class coming in subsequent series.  If a
class is net namespace aware, then the below [1] call trace is observed in
above situation.

Therefore, to avoid the race, keep a reference count and let device
unregistration wait until all netlink users drop the reference.

[1] Call trace:
kernfs: ns required in 'infiniband' for 'mlx5_0'
WARNING: CPU: 18 PID: 44270 at fs/kernfs/dir.c:842 kernfs_find_ns+0x104/0x120
libahci i2c_core mlxfw libata dca [last unloaded: devlink]
RIP: 0010:kernfs_find_ns+0x104/0x120
Call Trace:
kernfs_find_and_get_ns+0x2e/0x50
sysfs_rename_link_ns+0x40/0xb0
device_rename+0xb2/0xf0
ib_device_rename+0xb3/0x100 [ib_core]
nldev_set_doit+0x165/0x190 [ib_core]
rdma_nl_rcv_msg+0x249/0x250 [ib_core]
? netlink_deliver_tap+0x8f/0x3e0
rdma_nl_rcv+0xd6/0x120 [ib_core]
netlink_unicast+0x17c/0x230
netlink_sendmsg+0x2f0/0x3e0
sock_sendmsg+0x30/0x40
__sys_sendto+0xdc/0x160

Fixes: da5c850782 ("RDMA/nldev: add driver-specific resource tracking")
Signed-off-by: Parav Pandit <parav@mellanox.com>
Signed-off-by: Leon Romanovsky <leonro@mellanox.com>
Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>
2018-11-22 12:39:26 -07:00
..
acpi
asm-generic percpu: remove PER_CPU_DEF_ATTRIBUTES macro 2018-10-31 08:54:14 -07:00
clocksource
crypto
drm drm, i915, amdgpu, bridge + core quirk 2018-11-02 10:58:20 -07:00
dt-bindings This time it looks like a quieter release cycle in the clk tree. I guess that's 2018-10-31 11:08:30 -07:00
keys
kvm
linux {net,IB}/mlx5: Move Page fault EQ and ODP logic to RDMA 2018-11-20 20:07:33 +02:00
math-emu
media media updates for v4.20-rc1 2018-10-31 10:53:29 -07:00
memory
misc
net net: drop a space before tabs 2018-10-31 12:37:12 -07:00
pcmcia
ras
rdma RDMA/core: Sync unregistration with netlink commands 2018-11-22 12:39:26 -07:00
scsi
soc ARM: SoC driver updates for 4.17 2018-10-29 15:16:01 -07:00
sound
target
trace Merge branch 'work.afs' of git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs 2018-11-01 19:58:52 -07:00
uapi Merge branch 'perf-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip 2018-11-03 18:13:43 -07:00
video
xen