kernel_optimize_test/lib
Ross Zwisler 9f418224e8 radix tree: fix multi-order iteration race
Fix a race in the multi-order iteration code which causes the kernel to
hit a GP fault.  This was first seen with a production v4.15 based
kernel (4.15.6-300.fc27.x86_64) utilizing a DAX workload which used
order 9 PMD DAX entries.

The race has to do with how we tear down multi-order sibling entries
when we are removing an item from the tree.  Remember for example that
an order 2 entry looks like this:

  struct radix_tree_node.slots[] = [entry][sibling][sibling][sibling]

where 'entry' is in some slot in the struct radix_tree_node, and the
three slots following 'entry' contain sibling pointers which point back
to 'entry.'
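
The layout above can be sketched in a small userspace model. This is
only an illustration, not the kernel's code: `toy_node`, `toy_insert()`,
`toy_make_sibling()`, `toy_sibling_target()`, the 8-slot node size and
the low-bit tag are all hypothetical names and values chosen for the
example.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define TOY_MAP_SIZE 8                 /* hypothetical; a real node is larger */
#define TOY_INTERNAL ((uintptr_t)1)    /* low bit tags "not a user item" */

struct toy_node {
	void *slots[TOY_MAP_SIZE];
};

/* Encode "pointer to a slot" as a tagged value that can live in a slot. */
static void *toy_make_sibling(void **slot)
{
	return (void *)((uintptr_t)slot | TOY_INTERNAL);
}

/* Decode a tagged sibling value back to the slot it refers to. */
static void **toy_sibling_target(void *entry)
{
	return (void **)((uintptr_t)entry & ~TOY_INTERNAL);
}

/*
 * Store 'item' at 'index' and fill the following (1 << order) - 1 slots
 * with sibling values that point back at the slot holding 'item'.
 */
static void toy_insert(struct toy_node *node, unsigned int index,
		       unsigned int order, void *item)
{
	unsigned int i, n = 1U << order;

	assert(index % n == 0 && index + n <= TOY_MAP_SIZE);
	assert(((uintptr_t)item & TOY_INTERNAL) == 0);

	node->slots[index] = item;
	for (i = 1; i < n; i++)
		node->slots[index + i] = toy_make_sibling(&node->slots[index]);
}
```

Calling toy_insert() on a zeroed node with index 0 and order 2 leaves
the item in slot 0, three sibling values in slots 1 to 3 that all decode
to &slots[0], and slot 4 still NULL.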

When we delete 'entry' from the tree, we call:

  radix_tree_delete()
    radix_tree_delete_item()
      __radix_tree_delete()
        replace_slot()

replace_slot() first removes the siblings in order from the first to the
last, and then replaces 'entry' with NULL.  This means that for a brief
period of time we end up with one or more of the siblings removed, so:

  struct radix_tree_node.slots[] = [entry][NULL][sibling][sibling]

This causes an issue if you have a reader iterating over the slots in
the tree via radix_tree_for_each_slot() while only under
rcu_read_lock()/rcu_read_unlock() protection.  This is a common case in
mm/filemap.c.

The issue is that when __radix_tree_next_slot() => skip_siblings() tries
to skip over the sibling entries in the slots, it currently does so with
an exact match on the slot directly preceding our current slot.
Normally this works:

                                      V preceding slot
  struct radix_tree_node.slots[] = [entry][sibling][sibling][sibling]
                                              ^ current slot

This lets you find the first sibling, and you skip them all in order.

But in the case where one of the siblings is NULL, that slot is skipped
and then our sibling detection is interrupted:

                                             V preceding slot
  struct radix_tree_node.slots[] = [entry][NULL][sibling][sibling]
                                                    ^ current slot

This means that the sibling pointers aren't recognized since they point
all the way back to 'entry', so we think that they are normal internal
radix tree pointers.  This causes us to think we need to walk down to a
struct radix_tree_node starting at the address of 'entry'.

In a real running kernel this will crash the thread with a GP fault when
you try and dereference the slots in your broken node starting at
'entry'.
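
The failure mode can be reproduced in a small userspace model of the
exact-match check.  This is a sketch under stated assumptions, not the
kernel's skip_siblings(): `toy_node`, `toy_make_sibling()`,
`toy_fill_order2()` and `toy_skip_exact()` are hypothetical names, and
the model assumes the reader has already stepped past any NULL slots
before the check runs.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define TOY_MAP_SIZE 8                 /* hypothetical node size */
#define TOY_INTERNAL ((uintptr_t)1)    /* low bit tags "not a user item" */

struct toy_node {
	void *slots[TOY_MAP_SIZE];
};

/* Encode "pointer to a slot" as a tagged value that can live in a slot. */
static void *toy_make_sibling(void **slot)
{
	return (void *)((uintptr_t)slot | TOY_INTERNAL);
}

/* Lay out an order-2 entry in slots 0..3, followed by an unrelated item. */
static void toy_fill_order2(struct toy_node *node, void *item, void *next)
{
	unsigned int i;

	for (i = 0; i < TOY_MAP_SIZE; i++)
		node->slots[i] = NULL;
	node->slots[0] = item;
	for (i = 1; i < 4; i++)
		node->slots[i] = toy_make_sibling(&node->slots[0]);
	node->slots[4] = next;
}

/*
 * Exact-match detection: a slot counts as a sibling only if it equals the
 * tagged address of the slot directly preceding 'start'.  Returns the
 * index of the first non-NULL slot at or after 'start' that is NOT treated
 * as a sibling, or TOY_MAP_SIZE if there is none.
 */
static unsigned int toy_skip_exact(struct toy_node *node, unsigned int start)
{
	void *sib;
	unsigned int i;

	assert(start >= 1 && start < TOY_MAP_SIZE);
	sib = toy_make_sibling(&node->slots[start - 1]);

	for (i = start; i < TOY_MAP_SIZE; i++) {
		void *entry = node->slots[i];

		if (entry && entry != sib)
			return i;
	}
	return TOY_MAP_SIZE;
}
```

On the intact layout, toy_skip_exact(&node, 1) returns 4, the unrelated
item.  After slot 1 is set to NULL, a reader that resumes at slot 2 gets
2 back: the remaining sibling is mistaken for something other than a
sibling, which is the state that leads to the bad dereference described
above.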

We fix this race by changing the way that skip_siblings() detects
sibling entries.  Instead of testing against the preceding slot, we look
for siblings via is_sibling_entry(), which checks whether the pointer
falls within the struct radix_tree_node.slots[] array.  This ensures
that sibling entries are properly identified, even if they are no longer
contiguous with the 'entry' they point to.
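
A range-based check of this kind can be sketched in the same sort of
userspace model.  Again this is an illustration only: `toy_node`,
`toy_is_sibling()`, `toy_fill_order2()` and `toy_skip_range()` are
hypothetical names, and the tag handling is an assumption of the model
rather than a copy of the kernel's is_sibling_entry().

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

#define TOY_MAP_SIZE 8                 /* hypothetical node size */
#define TOY_INTERNAL ((uintptr_t)1)    /* low bit tags "not a user item" */

struct toy_node {
	void *slots[TOY_MAP_SIZE];
};

/* Encode "pointer to a slot" as a tagged value that can live in a slot. */
static void *toy_make_sibling(void **slot)
{
	return (void *)((uintptr_t)slot | TOY_INTERNAL);
}

/* Lay out an order-2 entry in slots 0..3, followed by an unrelated item. */
static void toy_fill_order2(struct toy_node *node, void *item, void *next)
{
	unsigned int i;

	for (i = 0; i < TOY_MAP_SIZE; i++)
		node->slots[i] = NULL;
	node->slots[0] = item;
	for (i = 1; i < 4; i++)
		node->slots[i] = toy_make_sibling(&node->slots[0]);
	node->slots[4] = next;
}

/*
 * Range-based detection: a tagged value is a sibling if the address it
 * carries lies anywhere inside this node's own slots[] array.
 */
static bool toy_is_sibling(const struct toy_node *node, void *entry)
{
	uintptr_t v = (uintptr_t)entry;
	uintptr_t lo = (uintptr_t)node->slots;
	uintptr_t hi = (uintptr_t)(node->slots + TOY_MAP_SIZE);

	if (!(v & TOY_INTERNAL))
		return false;
	v &= ~TOY_INTERNAL;
	return v >= lo && v < hi;
}

/*
 * Returns the index of the first non-NULL, non-sibling slot at or after
 * 'start', or TOY_MAP_SIZE if there is none.
 */
static unsigned int toy_skip_range(struct toy_node *node, unsigned int start)
{
	unsigned int i;

	for (i = start; i < TOY_MAP_SIZE; i++) {
		void *entry = node->slots[i];

		if (entry && !toy_is_sibling(node, entry))
			return i;
	}
	return TOY_MAP_SIZE;
}
```

With this check the result no longer depends on what sits in the
preceding slot: toy_skip_range() returns 4 both for the intact layout
starting at slot 1 and for the torn layout (slot 1 set to NULL) starting
at slot 2.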

Link: http://lkml.kernel.org/r/20180503192430.7582-6-ross.zwisler@linux.intel.com
Fixes: 148deab223 ("radix-tree: improve multiorder iterators")
Signed-off-by: Ross Zwisler <ross.zwisler@linux.intel.com>
Reported-by: CR, Sapthagirish <sapthagirish.cr@intel.com>
Reviewed-by: Jan Kara <jack@suse.cz>
Cc: Matthew Wilcox <willy@infradead.org>
Cc: Christoph Hellwig <hch@lst.de>
Cc: Dan Williams <dan.j.williams@intel.com>
Cc: Dave Chinner <david@fromorbit.com>
Cc: <stable@vger.kernel.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2018-05-18 17:17:12 -07:00
842
fonts
lz4
lzo
mpi
raid6 powerpc updates for 4.17 2018-04-07 12:08:19 -07:00
reed_solomon
xz
zlib_deflate
zlib_inflate
zstd
.gitignore
argv_split.c
ashldi3.c
ashrdi3.c
asn1_decoder.c
assoc_array.c
atomic64_test.c
atomic64.c
audit.c
bcd.c
bch.c
bitmap.c lib: fix stall in __bitmap_parselist() 2018-04-05 21:36:21 -07:00
bitrev.c
bsearch.c
btree.c
bucket_locks.c
bug.c
build_OID_registry
bust_spinlocks.c
chacha20.c
check_signature.c
checksum.c
clz_ctz.c
clz_tab.c
cmdline.c
cmpdi2.c
compat_audit.c
cordic.c
cpu_rmap.c
cpumask.c
crc-ccitt.c
crc-itu-t.c
crc-t10dif.c
crc4.c
crc7.c
crc8.c
crc16.c
crc32.c
crc32defs.h
crc32test.c
ctype.c
debug_info.c
debug_locks.c
debugobjects.c
dec_and_lock.c
decompress_bunzip2.c
decompress_inflate.c
decompress_unlz4.c
decompress_unlzma.c
decompress_unlzo.c
decompress_unxz.c
decompress.c
devres.c
digsig.c
div64.c
dma-debug.c
dma-direct.c dma-direct: don't retry allocation for no-op GFP_DMA 2018-04-23 14:43:27 +02:00
dma-virt.c
dump_stack.c
dynamic_debug.c
dynamic_queue_limits.c
earlycpio.c
error-inject.c
errseq.c errseq: Always report a writeback error once 2018-04-27 08:51:26 -04:00
extable.c
fault-inject.c
fdt_empty_tree.c
fdt_ro.c
fdt_rw.c
fdt_strerror.c
fdt_sw.c
fdt_wip.c
fdt.c
find_bit_benchmark.c lib/find_bit_benchmark.c: avoid soft lockup in test_find_first_bit() 2018-05-11 17:28:45 -07:00
find_bit.c
flex_array.c
flex_proportions.c
gcd.c
gen_crc32table.c
genalloc.c
glob.c
globtest.c
hexdump.c
hweight.c
idr.c
inflate.c
int_sqrt.c
interval_tree_test.c
interval_tree.c
iomap_copy.c
iomap.c
iommu-common.c
iommu-helper.c
ioremap.c
iov_iter.c
irq_poll.c
irq_regs.c
is_single_threaded.c
jedec_ddr_data.c
kasprintf.c
Kconfig
Kconfig.debug lib/Kconfig.debug: Debug Lockups and Hangs: keep SOFTLOCKUP options together 2018-04-11 10:28:35 -07:00
Kconfig.kasan
Kconfig.kgdb
Kconfig.ubsan lib: add testing module for UBSAN 2018-04-11 10:28:35 -07:00
kfifo.c
klist.c
kobject_uevent.c
kobject.c kobject: don't use WARN for registration failures 2018-04-23 13:14:55 +02:00
kstrtox.c
kstrtox.h
lcm.c
libcrc32c.c
list_debug.c lib/list_debug.c: print unmangled addresses 2018-04-11 10:28:35 -07:00
list_sort.c
llist.c
locking-selftest-hardirq.h
locking-selftest-mutex.h
locking-selftest-rlock-hardirq.h
locking-selftest-rlock-softirq.h
locking-selftest-rlock.h
locking-selftest-rsem.h
locking-selftest-rtmutex.h
locking-selftest-softirq.h
locking-selftest-spin-hardirq.h
locking-selftest-spin-softirq.h
locking-selftest-spin.h
locking-selftest-wlock-hardirq.h
locking-selftest-wlock-softirq.h
locking-selftest-wlock.h
locking-selftest-wsem.h
locking-selftest.c
lockref.c lockref: Add lockref_put_not_zero 2018-04-12 09:41:19 -07:00
logic_pio.c
lru_cache.c
lshrdi3.c
Makefile lib: add testing module for UBSAN 2018-04-11 10:28:35 -07:00
memory-notifier-error-inject.c
memweight.c
muldi3.c
net_utils.c
netdev-notifier-error-inject.c
nlattr.c
nmi_backtrace.c
nodemask.c
notifier-error-inject.c
notifier-error-inject.h
of-reconfig-notifier-error-inject.c
oid_registry.c
once.c
parman.c
parser.c
pci_iomap.c
percpu_counter.c
percpu_ida.c
percpu_test.c
percpu-refcount.c
plist.c
pm-notifier-error-inject.c
prime_numbers.c
radix-tree.c radix tree: fix multi-order iteration race 2018-05-18 17:17:12 -07:00
random32.c
ratelimit.c
rational.c
rbtree_test.c
rbtree.c
reciprocal_div.c
refcount.c
rhashtable.c
sbitmap.c
scatterlist.c
seq_buf.c
sg_pool.c
sg_split.c
sha1.c
sha256.c kernel/kexec_file.c: move purgatories sha256 to common code 2018-04-13 17:10:28 -07:00
show_mem.c
siphash.c
smp_processor_id.c
sort.c
stackdepot.c
stmp_device.c
string_helpers.c
string.c
strncpy_from_user.c
strnlen_user.c
swiotlb.c swiotlb: silent unwanted warning "buffer is full" 2018-05-12 11:57:37 +02:00
syscall.c
test_bitmap.c lib/test_bitmap.c: fix bitmap optimisation tests to report errors correctly 2018-05-18 17:17:12 -07:00
test_bpf.c
test_debug_virtual.c
test_firmware.c headers: untangle kmemleak.h from mm.h 2018-04-05 21:36:27 -07:00
test_hash.c
test_hexdump.c
test_kasan.c kasan: fix invalid-free test crashing the kernel 2018-04-11 10:28:32 -07:00
test_kmod.c
test_list_sort.c
test_module.c
test_parman.c
test_printf.c
test_rhashtable.c
test_siphash.c
test_sort.c
test_static_key_base.c
test_static_keys.c
test_string.c
test_sysctl.c
test_ubsan.c lib/test_ubsan.c: make test_ubsan_misaligned_access() static 2018-04-11 10:28:35 -07:00
test_user_copy.c
test_uuid.c
test-kstrtox.c
test-string_helpers.c
textsearch.c textsearch: fix kernel-doc warnings and add kernel-api section 2018-04-16 18:53:13 -04:00
timerqueue.c
ts_bm.c
ts_fsm.c
ts_kmp.c
ubsan.c
ubsan.h
ucmpdi2.c
ucs2_string.c
usercopy.c
uuid.c
vsprintf.c vsprintf: Replace memory barrier with static_key for random_ptr_key update 2018-05-16 09:01:41 -04:00
win_minmax.c
xxhash.c