Go to file
Vivek Goyal 9eaa10be0c dax: Wake up all waiters after invalidating dax entry
[ Upstream commit 237388320deffde7c2d65ed8fc9eef670dc979b3 ]

I am seeing missed wakeups which ultimately lead to a deadlock when I am
using virtiofs with DAX enabled and running "make -j". I had to mount
virtiofs as rootfs and also reduce to dax window size to 256M to reproduce
the problem consistently.

So here is the problem. put_unlocked_entry() wakes up waiters only
if entry is not null as well as !dax_is_conflict(entry). But if I
call multiple instances of invalidate_inode_pages2() in parallel,
then I can run into a situation where there are waiters on
this index but nobody will wake these waiters.

invalidate_inode_pages2()
  invalidate_inode_pages2_range()
    invalidate_exceptional_entry2()
      dax_invalidate_mapping_entry_sync()
        __dax_invalidate_entry() {
                xas_lock_irq(&xas);
                entry = get_unlocked_entry(&xas, 0);
                ...
                ...
                dax_disassociate_entry(entry, mapping, trunc);
                xas_store(&xas, NULL);
                ...
                ...
                put_unlocked_entry(&xas, entry);
                xas_unlock_irq(&xas);
        }

Say a fault in in progress and it has locked entry at offset say "0x1c".
Now say three instances of invalidate_inode_pages2() are in progress
(A, B, C) and they all try to invalidate entry at offset "0x1c". Given
dax entry is locked, all tree instances A, B, C will wait in wait queue.

When dax fault finishes, say A is woken up. It will store NULL entry
at index "0x1c" and wake up B. When B comes along it will find "entry=0"
at page offset 0x1c and it will call put_unlocked_entry(&xas, 0). And
this means put_unlocked_entry() will not wake up next waiter, given
the current code. And that means C continues to wait and is not woken
up.

This patch fixes the issue by waking up all waiters when a dax entry
has been invalidated. This seems to fix the deadlock I am facing
and I can make forward progress.

Reported-by: Sergio Lopez <slp@redhat.com>
Fixes: ac401cc782 ("dax: New fault locking")
Reviewed-by: Jan Kara <jack@suse.cz>
Suggested-by: Dan Williams <dan.j.williams@intel.com>
Signed-off-by: Vivek Goyal <vgoyal@redhat.com>
Link: https://lore.kernel.org/r/20210428190314.1865312-4-vgoyal@redhat.com
Signed-off-by: Dan Williams <dan.j.williams@intel.com>
Signed-off-by: Sasha Levin <sashal@kernel.org>
2021-05-19 10:13:12 +02:00
arch KVM: x86: Prevent deadlock against tk_core.seq 2021-05-19 10:13:12 +02:00
block blk-iocost: fix weight updates of inner active iocgs 2021-05-19 10:13:11 +02:00
certs certs: Fix blacklist flag type confusion 2021-03-04 11:37:59 +01:00
crypto async_xor: increase src_offs when dropping destination page 2021-05-14 09:49:59 +02:00
Documentation kbuild: generate Module.symvers only when vmlinux exists 2021-05-19 10:12:59 +02:00
drivers drm/msm/dp: initialize audio_comp when audio starts 2021-05-19 10:13:12 +02:00
fs dax: Wake up all waiters after invalidating dax entry 2021-05-19 10:13:12 +02:00
include mm/hugetlb: fix F_SEAL_FUTURE_WRITE 2021-05-19 10:13:11 +02:00
init seccomp: Fix CONFIG tests for Seccomp_filters 2021-05-14 09:50:24 +02:00
ipc ipc: adjust proc_ipc_sem_dointvec definition to match prototype 2020-09-05 12:14:29 -07:00
kernel kernel/resource: make walk_mem_res() find all busy IORESOURCE_MEM resources 2021-05-19 10:13:09 +02:00
lib kasan: fix unit tests with CONFIG_UBSAN_LOCAL_BOUNDS enabled 2021-05-19 10:13:11 +02:00
LICENSES LICENSES/deprecated: add Zlib license text 2020-09-16 14:33:49 +02:00
mm mm/hugetlb: fix F_SEAL_FUTURE_WRITE 2021-05-19 10:13:11 +02:00
net mptcp: fix splat when closing unaccepted socket 2021-05-19 10:13:10 +02:00
samples samples/bpf: Fix broken tracex1 due to kprobe argument change 2021-05-19 10:12:57 +02:00
scripts kbuild: generate Module.symvers only when vmlinux exists 2021-05-19 10:12:59 +02:00
security KEYS: trusted: Fix memory leak on object td 2021-05-19 10:12:50 +02:00
sound ASoC: rt286: Make RT286_SET_GPIO_* readable and writable 2021-05-19 10:13:00 +02:00
tools libbpf: Fix signed overflow in ringbuf_process_ring 2021-05-19 10:13:06 +02:00
usr Merge branch 'work.fdpic' of git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs 2020-08-07 13:29:39 -07:00
virt kvm: exit halt polling on need_resched() as well 2021-05-19 10:13:12 +02:00
.clang-format RDMA 5.10 pull request 2020-10-17 11:18:18 -07:00
.cocciconfig scripts: add Linux .cocciconfig for coccinelle 2016-07-22 12:13:39 +02:00
.get_maintainer.ignore Opt out of scripts/get_maintainer.pl 2019-05-16 10:53:40 -07:00
.gitattributes .gitattributes: use 'dts' diff driver for dts files 2019-12-04 19:44:11 -08:00
.gitignore kbuild: generate Module.symvers only when vmlinux exists 2021-05-19 10:12:59 +02:00
.mailmap mailmap: add two more addresses of Uwe Kleine-König 2020-12-06 10:19:07 -08:00
COPYING COPYING: state that all contributions really are covered by this file 2020-02-10 13:32:20 -08:00
CREDITS MAINTAINERS: Move Jason Cooper to CREDITS 2020-11-30 10:20:34 +01:00
Kbuild kbuild: rename hostprogs-y/always to hostprogs/always-y 2020-02-04 01:53:07 +09:00
Kconfig kbuild: ensure full rebuild when the compiler is updated 2020-05-12 13:28:33 +09:00
MAINTAINERS f2fs: move ioctl interface definitions to separated file 2021-05-19 10:13:00 +02:00
Makefile kbuild: generate Module.symvers only when vmlinux exists 2021-05-19 10:12:59 +02:00
README Drop all 00-INDEX files from Documentation/ 2018-09-09 15:08:58 -06:00

Linux kernel
============

There are several guides for kernel developers and users. These guides can
be rendered in a number of formats, like HTML and PDF. Please read
Documentation/admin-guide/README.rst first.

In order to build the documentation, use ``make htmldocs`` or
``make pdfdocs``.  The formatted documentation can also be read online at:

    https://www.kernel.org/doc/html/latest/

There are various text files in the Documentation/ subdirectory,
several of them using the Restructured Text markup notation.

Please read the Documentation/process/changes.rst file, as it contains the
requirements for building and running the kernel, and information about
the problems which may result by upgrading your kernel.