kernel_optimize_test/fs
Dave Chinner 14b494b7aa xfs: Enforce attr3 buffer recovery order
commit d8f4c2d0398fa1d92cacf854daf80d21a46bfefc upstream.

>From the department of "WTAF? How did we miss that!?"...

When we are recovering a buffer, the first thing we do is check the
buffer magic number and extract the LSN from the buffer. If the LSN
is older than the current LSN, we replay the modification to it. If
the metadata on disk is newer than the transaction in the log, we
skip it. This is a fundamental v5 filesystem metadata recovery
behaviour.

generic/482 failed with an attribute writeback failure during log
recovery. The write verifier caught the corruption before it got
written to disk, and the attr buffer dump looked like:

XFS (dm-3): Metadata corruption detected at xfs_attr3_leaf_verify+0x275/0x2e0, xfs_attr3_leaf block 0x19be8
XFS (dm-3): Unmount and run xfs_repair
XFS (dm-3): First 128 bytes of corrupted metadata buffer:
00000000: 00 00 00 00 00 00 00 00 3b ee 00 00 4d 2a 01 e1  ........;...M*..
00000010: 00 00 00 00 00 01 9b e8 00 00 00 01 00 00 05 38  ...............8
                                  ^^^^^^^^^^^^^^^^^^^^^^^
00000020: df 39 5e 51 58 ac 44 b6 8d c5 e7 10 44 09 bc 17  .9^QX.D.....D...
00000030: 00 00 00 00 00 02 00 83 00 03 00 cc 0f 24 01 00  .............$..
00000040: 00 68 0e bc 0f c8 00 10 00 00 00 00 00 00 00 00  .h..............
00000050: 00 00 3c 31 0f 24 01 00 00 00 3c 32 0f 88 01 00  ..<1.$....<2....
00000060: 00 00 3c 33 0f d8 01 00 00 00 00 00 00 00 00 00  ..<3............
00000070: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00  ................
.....

The highlighted bytes are the LSN that was replayed into the
buffer: 0x100000538. This is cycle 1, block 0x538. Prior to replay,
that block on disk looks like this:

$ sudo xfs_db -c "fsb 0x417d" -c "type attr3" -c p /dev/mapper/thin-vol
hdr.info.hdr.forw = 0
hdr.info.hdr.back = 0
hdr.info.hdr.magic = 0x3bee
hdr.info.crc = 0xb5af0bc6 (correct)
hdr.info.bno = 105448
hdr.info.lsn = 0x100000900
               ^^^^^^^^^^^
hdr.info.uuid = df395e51-58ac-44b6-8dc5-e7104409bc17
hdr.info.owner = 131203
hdr.count = 2
hdr.usedbytes = 120
hdr.firstused = 3796
hdr.holes = 1
hdr.freemap[0-2] = [base,size]

Note the LSN stamped into the buffer on disk: 1/0x900. The version
on disk is much newer than the log transaction that was being
replayed. That's a bug, and should -never- happen.

So I immediately went to look at xlog_recover_get_buf_lsn() to check
that we handled the LSN correctly. I was wondering if there was a
similar "two commits with the same start LSN skips the second
replay" problem with buffers. I didn't get that far, because I found
a much more basic, rudimentary bug: xlog_recover_get_buf_lsn()
doesn't recognise buffers with XFS_ATTR3_LEAF_MAGIC set in them!!!

IOWs, attr3 leaf buffers fall through the magic number checks
unrecognised, so trigger the "recover immediately" behaviour instead
of undergoing an LSN check. IOWs, we incorrectly replay ATTR3 leaf
buffers and that causes silent on disk corruption of inode attribute
forks and potentially other things....

Git history shows this is *another* zero day bug, this time
introduced in commit 50d5c8d8e9 ("xfs: check LSN ordering for v5
superblocks during recovery") which failed to handle the attr3 leaf
buffers in recovery. And we've failed to handle them ever since...

Signed-off-by: Dave Chinner <dchinner@redhat.com>
Reviewed-by: Christoph Hellwig <hch@lst.de>
Reviewed-by: Darrick J. Wong <djwong@kernel.org>
Signed-off-by: Darrick J. Wong <djwong@kernel.org>
Signed-off-by: Amir Goldstein <amir73il@gmail.com>
Acked-by: Darrick J. Wong <djwong@kernel.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2022-08-03 12:00:51 +02:00
..
9p 9p: missing chunk of "fs/9p: Don't update file type when updating file attributes" 2022-06-22 14:13:12 +02:00
adfs Merge branch 'work.misc' of git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs 2020-10-24 12:26:05 -07:00
affs fs/affs: release old buffer head on error path 2021-03-04 11:38:37 +01:00
afs afs: Fix dynamic root getattr 2022-06-29 08:59:49 +02:00
autofs autofs: harden ioctl table 2020-10-16 11:11:22 -07:00
befs [PATCH] reduce boilerplate in fsid handling 2020-09-18 16:45:50 -04:00
bfs bfs: don't use WARNING: string when it's just info. 2021-01-06 14:56:52 +01:00
btrfs btrfs: return -EAGAIN for NOWAIT dio reads/writes on compressed and inline extents 2022-07-21 21:20:01 +02:00
cachefiles fs/cachefiles: Remove wait_bit_key layout dependency 2021-03-30 14:32:07 +02:00
ceph ceph: allow ceph.dir.rctime xattr to be updatable 2022-06-14 18:32:44 +02:00
cifs cifs: fix reconnect on smb3 mount types 2022-06-14 18:32:45 +02:00
coda docs: filesystems: convert coda.txt to ReST 2020-05-05 09:22:21 -06:00
configfs configfs: fix a race in configfs_{,un}register_subsystem() 2022-03-02 11:42:52 +01:00
cramfs [PATCH] reduce boilerplate in fsid handling 2020-09-18 16:45:50 -04:00
crypto fscrypt: allow 256-bit master keys with AES-256-XTS 2021-11-18 14:03:54 +01:00
debugfs debugfs: lockdown: Allow reading debugfs files that are not world readable 2022-01-27 10:54:02 +01:00
devpts fsnotify: fix fsnotify hooks in pseudo filesystems 2022-02-01 17:25:39 +01:00
dlm dlm: fix pending remove if msg allocation fails 2022-07-29 17:19:24 +02:00
ecryptfs Revert "ecryptfs: replace BUG_ON with error handling code" 2021-05-26 12:06:55 +02:00
efivarfs efivarfs: revert "fix memory leak in efivarfs_create()" 2020-11-25 16:55:02 +01:00
efs [PATCH] reduce boilerplate in fsid handling 2020-09-18 16:45:50 -04:00
erofs erofs: fix deadlock when shrink erofs slab 2021-12-01 09:19:05 +01:00
exfat exfat: check if cluster num is valid 2022-06-06 08:42:42 +02:00
exportfs exportfs_decode_fh(): negative pinned may become positive without the parent locked 2019-11-10 11:56:05 -05:00
ext2 ext2: correct max file size computing 2022-04-08 14:40:18 +02:00
ext4 ext4: fix race condition between ext4_write and ext4_convert_inline_data 2022-07-21 21:20:02 +02:00
f2fs f2fs: attach inline_data after setting compression 2022-06-29 08:59:51 +02:00
fat fat: add ratelimit to fat*_ent_bread() 2022-06-09 10:20:58 +02:00
freevxfs fs: Fill in max and min timestamps in superblock 2019-08-30 07:27:17 -07:00
fscache fscache: Fix cookie key hashing 2021-09-18 13:40:15 +02:00
fuse fuse: fix pipe buffer lifetime for direct_io 2022-03-16 14:16:01 +01:00
gfs2 gfs2: use i_lock spin_lock for inode qadata 2022-06-09 10:20:57 +02:00
hfs hfs: add lock nesting notation to hfs_find_init 2021-07-31 08:16:12 +02:00
hfsplus hfsplus: prevent corruption in shrinking truncate 2021-05-19 10:13:10 +02:00
hostfs hostfs: fix memory handling in follow_link() 2021-04-14 08:42:06 +02:00
hpfs [PATCH] reduce boilerplate in fsid handling 2020-09-18 16:45:50 -04:00
hugetlbfs mm, hugetlb: allow for "high" userspace addresses 2022-04-27 13:53:54 +02:00
iomap xfs: use current->journal_info for detecting transaction recursion 2022-07-07 17:52:19 +02:00
isofs isofs: Fix out of bound access for corrupted isofs image 2021-11-12 14:58:33 +01:00
jbd2 jbd2: fix a potential race while discarding reserved buffers after an abort 2022-04-27 13:53:57 +02:00
jffs2 jffs2: fix memory leak in jffs2_do_fill_super 2022-06-14 18:32:35 +02:00
jfs fs: jfs: fix possible NULL pointer dereference in dbFree() 2022-06-09 10:20:57 +02:00
kernfs kernfs: Separate kernfs_pr_cont_buf and rename_lock. 2022-06-14 18:32:43 +02:00
lockd lockd: lockd server-side shouldn't set fl_ops 2021-09-18 13:40:30 +02:00
minix minix: fix bug when opening a file with O_DIRECT 2022-04-13 21:01:01 +02:00
nfs pNFS: Avoid a live lock condition in pnfs_update_layout() 2022-06-22 14:13:16 +02:00
nfs_common nfs_common: need lock during iterate through the list 2020-12-30 11:53:45 +01:00
nfsd NFSD: restore EINVAL error translation in nfsd_commit() 2022-07-07 17:52:17 +02:00
nilfs2 nilfs2: fix incorrect masking of permission flags for symlinks 2022-07-21 21:20:01 +02:00
nls treewide: replace '---help---' in Kconfig files with 'help' 2020-06-14 01:57:21 +09:00
notify fsnotify: fix wrong lockdep annotations 2022-06-09 10:21:03 +02:00
ntfs ntfs: fix use-after-free in ntfs_ucsncmp() 2022-08-03 12:00:43 +02:00
ocfs2 Revert "ocfs2: mount shared volume without ha stack" 2022-08-03 12:00:43 +02:00
omfs fs: omfs: use kmemdup() rather than kmalloc+memcpy 2020-09-22 23:39:45 -04:00
openpromfs Merge branch 'work.mount0' of git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs 2019-07-19 10:42:02 -07:00
orangefs orangefs: Fix the size of a memory allocation in orangefs_bufmap_alloc() 2022-01-20 09:17:50 +01:00
overlayfs ovl: fix warning in ovl_create_real() 2021-12-22 09:30:58 +01:00
proc proc: fix dentry/inode overinstantiating under /proc/${pid}/net 2022-06-09 10:21:17 +02:00
pstore pstore: Don't use semaphores in always-atomic-context code 2022-04-08 14:39:56 +02:00
qnx4 qnx4: work around gcc false positive warning bug 2021-09-30 10:11:08 +02:00
qnx6 [PATCH] reduce boilerplate in fsid handling 2020-09-18 16:45:50 -04:00
quota quota: Prevent memory allocation recursion while holding dq_lock 2022-06-22 14:13:14 +02:00
ramfs ramfs: fix nommu mmap with gaps in the page cache 2020-10-16 11:11:22 -07:00
reiserfs reiserfs: check directory items on read from disk 2021-08-12 13:22:19 +02:00
romfs Merge branch 'work.misc' of git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs 2020-10-24 12:26:05 -07:00
squashfs squashfs: fix divide error in calculate_skip() 2021-05-19 10:13:10 +02:00
sysfs sysfs: Add sysfs_emit and sysfs_emit_at to format sysfs output 2020-10-02 12:02:30 +02:00
sysv [PATCH] reduce boilerplate in fsid handling 2020-09-18 16:45:50 -04:00
tracefs tracefs: Set the group ownership in apply_options() not parse_options() 2022-03-02 11:42:54 +01:00
ubifs ubifs: Rectify space amount budget for mkdir/tmpfile operations 2022-04-13 21:00:53 +02:00
udf udf: Fix NULL ptr deref when converting from inline format 2022-02-01 17:25:39 +01:00
ufs Merge branch 'work.misc' of git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs 2020-10-24 12:26:05 -07:00
unicode unicode: Add utf8_casefold_hash 2020-09-10 14:03:31 -07:00
vboxsf vboxfs: fix broken legacy mount signature checking 2021-10-17 10:43:33 +02:00
verity fs-verity: fix signed integer overflow with i_size near S64_MAX 2021-10-06 15:55:46 +02:00
xfs xfs: Enforce attr3 buffer recovery order 2022-08-03 12:00:51 +02:00
zonefs zonefs: fix zonefs_iomap_begin() for reads 2022-06-25 15:16:08 +02:00
aio.c aio: fix use-after-free due to missing POLLFREE handling 2021-12-14 11:32:40 +01:00
anon_inodes.c Merge branch 'work.mount0' of git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs 2019-07-19 10:42:02 -07:00
attr.c utimes: Clamp the timestamps in notify_change() 2019-12-08 19:10:50 -05:00
bad_inode.c fs: move the fiemap definitions out of fs.h 2020-06-03 23:16:55 -04:00
binfmt_aout.c exec: Rename flush_old_exec begin_new_exec 2020-05-07 16:55:47 -05:00
binfmt_elf_fdpic.c coredump: Snapshot the vmas in do_coredump 2022-04-08 14:40:44 +02:00
binfmt_elf.c coredump: Use the vma snapshot in fill_files_note 2022-04-08 14:40:45 +02:00
binfmt_em86.c Merge branch 'akpm' (patches from Andrew) 2020-06-04 19:18:29 -07:00
binfmt_flat.c binfmt_flat: do not stop relocating GOT entries prematurely on riscv 2022-06-09 10:20:47 +02:00
binfmt_misc.c binfmt_misc: fix possible deadlock in bm_register_write 2021-03-17 17:06:35 +01:00
binfmt_script.c Merge branch 'akpm' (patches from Andrew) 2020-06-04 19:18:29 -07:00
block_dev.c block: fix a race between del_gendisk and BLKRRPART 2021-06-03 09:00:45 +02:00
buffer.c mm, memcg: rework remote charging API to support nesting 2020-10-18 09:27:09 -07:00
char_dev.c vfs: allow unprivileged whiteout creation 2020-05-14 16:44:23 +02:00
compat_binfmt_elf.c Split the old READ_IMPLIES_EXEC workaround from executable PT_GNU_STACK 2020-06-05 13:45:21 -07:00
coredump.c coredump: Use the vma snapshot in fill_files_note 2022-04-08 14:40:45 +02:00
d_path.c fs: fix NULL dereference due to data race in prepend_path() 2020-10-14 14:54:45 -07:00
dax.c dax: fix cache flush on PMD-mapped pages 2022-06-09 10:21:16 +02:00
dcache.c vfs: Use sequence counter with associated spinlock 2020-07-29 16:14:27 +02:00
dcookies.c treewide: Add SPDX license identifier for missed files 2019-05-21 10:50:45 +02:00
direct-io.c fs: direct-io: fix missing sdio->boundary 2021-04-14 08:41:58 +02:00
drop_caches.c sysctl: pass kernel pointers to ->proc_handler 2020-04-27 02:07:40 -04:00
eventfd.c eventfd: convert to f_op->read_iter() 2020-05-06 22:33:43 -04:00
eventpoll.c fs/epoll: restore waking from ep_done_scan() 2021-05-11 14:47:12 +02:00
exec.c fix race between exit_itimers() and /proc/pid/timers 2022-07-21 21:19:59 +02:00
fcntl.c fcntl: fix potential deadlock for &fasync_struct.fa_lock 2021-09-15 09:50:27 +02:00
fhandle.c fs/handle.c - fix up kerneldoc 2019-08-07 21:51:47 -04:00
file_table.c SUNRPC: Ensure we flush any closed sockets before xs_xprt_free() 2022-05-18 10:23:48 +02:00
file.c fs: fix fd table size alignment properly 2022-04-08 14:40:30 +02:00
filesystems.c fs/filesystems.c: downgrade user-reachable WARN_ONCE() to pr_warn_once() 2020-04-10 15:36:22 -07:00
fs_context.c memcg: charge fs_context and legacy_fs_context 2022-02-08 18:30:36 +01:00
fs_parser.c fs_parse: mark fs_param_bad_value() as static 2020-10-13 18:38:27 -07:00
fs_pin.c switch the remnants of releasing the mountpoint away from fs_pin 2019-07-16 22:52:37 -04:00
fs_struct.c vfs: Use sequence counter with associated spinlock 2020-07-29 16:14:27 +02:00
fs_types.c fs: common implementation of file type 2019-01-21 17:48:13 +01:00
fs-writeback.c fs-writeback: writeback_sb_inodes:Recalculate 'wrote' according skipped pages 2022-06-09 10:21:22 +02:00
fsopen.c treewide: Use fallthrough pseudo-keyword 2020-08-23 17:36:59 -05:00
init.c init: add an init_dup helper 2020-08-04 21:02:38 -04:00
inode.c fs: export an inode_update_time helper 2021-11-26 10:39:22 +01:00
internal.h cgroup1: fix leaked context root causing sporadic NULL deref in LTP 2021-07-31 08:16:11 +02:00
io_uring.c io_uring: Use original task for req identity in io_identity_cow() 2022-07-29 17:19:07 +02:00
io-wq.c io-wq: fix wakeup race when adding new work 2021-09-18 13:40:06 +02:00
io-wq.h io_uring: always batch cancel in *cancel_files() 2021-02-13 13:54:56 +01:00
ioctl.c fs: fix an infinite loop in iomap_fiemap 2022-05-25 09:17:54 +02:00
Kconfig tmpfs: disallow CONFIG_TMPFS_INODE64 on alpha 2021-02-17 11:02:21 +01:00
Kconfig.binfmt treewide: replace '---help---' in Kconfig files with 'help' 2020-06-14 01:57:21 +09:00
kernel_read_file.c vfs: check fd has read access in kernel_read_file_from_fd() 2021-10-27 09:56:51 +02:00
libfs.c libfs: fix error cast of negative value in simple_attr_write() 2020-11-22 10:48:22 -08:00
locks.c Revert "nfsd4: a client's own opens needn't prevent delegations" 2021-03-20 10:43:44 +01:00
Makefile Refactored code for 5.10: 2020-10-23 11:33:41 -07:00
mbcache.c treewide: Add SPDX license identifier for more missed files 2019-05-21 10:50:45 +02:00
mount.h proc/mounts: add cursor 2020-05-14 16:44:24 +02:00
mpage.c fs: convert mpage_readpages to mpage_readahead 2020-06-02 10:59:07 -07:00
namei.c fsnotify: invalidate dcache before IN_DELETE event 2022-02-01 17:25:48 +01:00
namespace.c fs: warn about impending deprecation of mandatory locks 2021-08-26 08:35:57 -04:00
no-block.c treewide: Replace GPLv2 boilerplate/reference with SPDX - rule 152 2019-05-30 11:26:32 -07:00
nsfs.c nsproxy: attach to namespaces via pidfds 2020-05-13 11:41:22 +02:00
open.c open: don't silently ignore unknown O-flags in openat2() 2021-07-14 16:55:59 +02:00
pipe.c pipe: Fix missing lock in pipe_resize_ring() 2022-06-06 08:42:41 +02:00
pnode.c propagate_one(): mnt_set_mountpoint() needs mount_lock 2020-04-27 10:37:14 -04:00
pnode.h mount: fix mounting of detached mounts onto targets that reside on shared mounts 2021-03-17 17:06:13 +01:00
posix_acl.c vfs: clean up posix_acl_permission() logic aroudn MAY_NOT_BLOCK 2020-06-08 11:04:19 -07:00
proc_namespace.c proc mountinfo: make splice available again 2020-12-30 11:54:02 +01:00
read_write.c Refactored code for 5.10: 2020-10-23 11:33:41 -07:00
readdir.c readdir: make sure to verify directory entry for legacy interfaces too 2021-04-21 13:00:54 +02:00
remap_range.c fs/remap: constrain dedupe of EOF blocks 2022-07-21 21:20:01 +02:00
select.c select: Fix indefinitely sleeping task in poll_schedule_timeout() 2022-01-29 10:26:11 +01:00
seq_file.c seq_file: disallow extremely large seq buffer allocations 2021-07-20 16:05:59 +02:00
signalfd.c signalfd: use wake_up_pollfree() 2021-12-14 11:32:40 +01:00
splice.c io_uring-5.10-2020-10-24 2020-10-24 12:40:18 -07:00
stack.c sched/rt, fs: Use CONFIG_PREEMPTION 2019-12-08 14:37:36 +01:00
stat.c stat: fix inconsistency between struct stat and struct compat_stat 2022-04-27 13:53:54 +02:00
statfs.c Add a "nosymfollow" mount option. 2020-08-27 16:06:47 -04:00
super.c vfs: make freeze_super abort when sync_filesystem returns error 2022-02-23 12:00:59 +01:00
sync.c overlayfs update for 5.8 2020-06-09 15:40:50 -07:00
timerfd.c timerfd: Make timerfd_settime() time namespace aware 2020-01-14 12:20:53 +01:00
userfaultfd.c userfaultfd: fix a race between writeprotect and exit_mmap() 2021-10-27 09:56:51 +02:00
utimes.c fs: expose utimes_common 2020-07-31 08:16:01 +02:00
xattr.c fs/xattr.c: fix kernel-doc warnings for setxattr & removexattr 2020-10-13 18:38:27 -07:00