binder_mmap() tries to prevent the creation of overly big binder mappings
by silently truncating the size of the VMA to 4MiB. However, this violates
the API contract of mmap(). If userspace attempts to create a large binder
VMA, and later attempts to unmap that VMA, it will call munmap() on a range
beyond the end of the VMA, which may have been allocated to another VMA in
the meantime. This can lead to userspace memory corruption.
The following sequence of calls leads to a segfault without this commit:
int main(void) {
int binder_fd = open("/dev/binder", O_RDWR);
if (binder_fd == -1) err(1, "open binder");
void *binder_mapping = mmap(NULL, 0x800000UL, PROT_READ, MAP_SHARED,
binder_fd, 0);
if (binder_mapping == MAP_FAILED) err(1, "mmap binder");
void *data_mapping = mmap(NULL, 0x400000UL, PROT_READ|PROT_WRITE,
MAP_PRIVATE|MAP_ANONYMOUS, -1, 0);
if (data_mapping == MAP_FAILED) err(1, "mmap data");
munmap(binder_mapping, 0x800000UL);
*(char*)data_mapping = 1;
return 0;
}
Cc: stable@vger.kernel.org
Signed-off-by: Jann Horn <jannh@google.com>
Acked-by: Todd Kjos <tkjos@google.com>
Acked-by: Christian Brauner <christian.brauner@ubuntu.com>
Link: https://lore.kernel.org/r/20191016150119.154756-1-jannh@google.com
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
binder_alloc_buffer_lookup() doesn't exist and is named
"binder_alloc_prepare_to_free()". Correct the code comments to reflect
this.
Signed-off-by: Joel Fernandes (Google) <joel@joelfernandes.org>
Reviewed-by: Christian Brauner <christian.brauner@ubuntu.com>
Link: https://lore.kernel.org/r/20190930201250.139554-1-joel@joelfernandes.org
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
When a binder transaction is initiated on a binder device coming from a
binderfs instance, a pointer to the name of the binder device is stashed
in the binder_transaction_log_entry's context_name member. Later on it
is used to print the name in print_binder_transaction_log_entry(). By
the time print_binder_transaction_log_entry() accesses context_name
binderfs_evict_inode() might have already freed the associated memory
thereby causing a UAF. Do the simple thing and prevent this by copying
the name of the binder device instead of stashing a pointer to it.
Reported-by: Jann Horn <jannh@google.com>
Fixes: 03e2e07e38 ("binder: Make transaction_log available in binderfs")
Link: https://lore.kernel.org/r/CAG48ez14Q0-F8LqsvcNbyR2o6gPW8SHXsm4u5jmD9MpsteM2Tw@mail.gmail.com
Signed-off-by: Christian Brauner <christian.brauner@ubuntu.com>
Reviewed-by: Joel Fernandes (Google) <joel@joelfernandes.org>
Acked-by: Todd Kjos <tkjos@google.com>
Reviewed-by: Hridya Valsaraju <hridya@google.com>
Link: https://lore.kernel.org/r/20191008130159.10161-1-christian.brauner@ubuntu.com
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Currently /sys/kernel/debug/binder/proc contains
the debug data for every binder_proc instance.
This patch makes this information also available
in a binderfs instance mounted with a mount option
"stats=global" in addition to debugfs. The patch does
not affect the presence of the file in debugfs.
If a binderfs instance is mounted at path /dev/binderfs,
this file would be present at /dev/binderfs/binder_logs/proc.
This change provides an alternate way to access this file when debugfs
is not mounted.
Acked-by: Christian Brauner <christian.brauner@ubuntu.com>
Signed-off-by: Hridya Valsaraju <hridya@google.com>
Link: https://lore.kernel.org/r/20190903161655.107408-5-hridya@google.com
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Currently, the binder transaction log files 'transaction_log'
and 'failed_transaction_log' live in debugfs at the following locations:
/sys/kernel/debug/binder/failed_transaction_log
/sys/kernel/debug/binder/transaction_log
This patch makes these files also available in a binderfs instance
mounted with the mount option "stats=global".
It does not affect the presence of these files in debugfs.
If a binderfs instance is mounted at path /dev/binderfs, the location of
these files will be as follows:
/dev/binderfs/binder_logs/failed_transaction_log
/dev/binderfs/binder_logs/transaction_log
This change provides an alternate option to access these files when
debugfs is not mounted.
Acked-by: Christian Brauner <christian.brauner@ubuntu.com>
Signed-off-by: Hridya Valsaraju <hridya@google.com>
Link: https://lore.kernel.org/r/20190903161655.107408-4-hridya@google.com
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
The following binder stat files currently live in debugfs.
/sys/kernel/debug/binder/state
/sys/kernel/debug/binder/stats
/sys/kernel/debug/binder/transactions
This patch makes these files available in a binderfs instance
mounted with the mount option 'stats=global'. For example, if a binderfs
instance is mounted at path /dev/binderfs, the above files will be
available at the following locations:
/dev/binderfs/binder_logs/state
/dev/binderfs/binder_logs/stats
/dev/binderfs/binder_logs/transactions
This provides a way to access them even when debugfs is not mounted.
Acked-by: Christian Brauner <christian.brauner@ubuntu.com>
Signed-off-by: Hridya Valsaraju <hridya@google.com>
Acked-by: Christian Brauner <christian.brauner@ubuntu.com>
Link: https://lore.kernel.org/r/20190903161655.107408-3-hridya@google.com
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Currently, all binder state and statistics live in debugfs.
We need this information even when debugfs is not mounted.
This patch adds the mount option 'stats' to enable a binderfs
instance to have binder debug information present in the same.
'stats=global' will enable the global binder statistics. In
the future, 'stats=local' will enable binder statistics local
to the binderfs instance. The two modes 'global' and 'local'
will be mutually exclusive. 'stats=global' option is only available
for a binderfs instance mounted in the initial user namespace.
An attempt to use the option to mount a binderfs instance in
another user namespace will return an EPERM error.
Signed-off-by: Hridya Valsaraju <hridya@google.com>
Acked-by: Christian Brauner <christian.brauner@ubuntu.com>
Link: https://lore.kernel.org/r/20190903161655.107408-2-hridya@google.com
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Currently, since each binderfs instance needs its own
private binder devices, every time a binderfs instance is
mounted, all the default binder devices need to be created
via the BINDER_CTL_ADD IOCTL. This patch aims to
add a solution to automatically create the default binder
devices for each binderfs instance that gets mounted.
To achieve this goal, when CONFIG_ANDROID_BINDERFS is set,
the default binder devices specified by CONFIG_ANDROID_BINDER_DEVICES
are created in each binderfs instance instead of global devices
being created by the binder driver.
Co-developed-by: Christian Brauner <christian.brauner@ubuntu.com>
Signed-off-by: Christian Brauner <christian.brauner@ubuntu.com>
Signed-off-by: Hridya Valsaraju <hridya@google.com>
Reviewed-by: Joel Fernandes (Google) <joel@joelfernandes.org>
Link: https://lore.kernel.org/r/20190808222727.132744-2-hridya@google.com
Link: https://lore.kernel.org/r/20190904110704.8606-2-christian.brauner@ubuntu.com
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Length of a binderfs device name cannot exceed BINDERFS_MAX_NAME.
This patch adds a check in binderfs_init() to ensure the same
for the default binder devices that will be created in every
binderfs instance.
Co-developed-by: Christian Brauner <christian.brauner@ubuntu.com>
Signed-off-by: Christian Brauner <christian.brauner@ubuntu.com>
Signed-off-by: Hridya Valsaraju <hridya@google.com>
Reviewed-by: Joel Fernandes (Google) <joel@joelfernandes.org>
Link: https://lore.kernel.org/r/20190808222727.132744-3-hridya@google.com
Link: https://lore.kernel.org/r/20190904110704.8606-3-christian.brauner@ubuntu.com
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Currently, a transaction to context manager from its own process
is prevented by checking if its binder_proc struct is the same as
that of the sender. However, this would not catch cases where the
process opens the binder device again and uses the new fd to send
a transaction to the context manager.
Reported-by: syzbot+8b3c354d33c4ac78bfad@syzkaller.appspotmail.com
Signed-off-by: Hridya Valsaraju <hridya@google.com>
Acked-by: Todd Kjos <tkjos@google.com>
Cc: stable <stable@vger.kernel.org>
Link: https://lore.kernel.org/r/20190715191804.112933-1-hridya@google.com
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
In case the target node requests a security context, the
extra_buffers_size is increased with the size of the security context.
But, that size is not available for use by regular scatter-gather
buffers; make sure the ending of that buffer is marked correctly.
Acked-by: Todd Kjos <tkjos@google.com>
Fixes: ec74136ded ("binder: create node flag to request sender's security context")
Signed-off-by: Martijn Coenen <maco@android.com>
Cc: stable@vger.kernel.org # 5.1+
Link: https://lore.kernel.org/r/20190709110923.220736-1-maco@android.com
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
The buffer copy functions assumed the caller would ensure
correct alignment and that the memory to be copied was
completely within the binder buffer. There have been
a few cases discovered by syzkallar where a malformed
transaction created by a user could violated the
assumptions and resulted in a BUG_ON.
The fix is to remove the BUG_ON and always return the
error to be handled appropriately by the caller.
Acked-by: Martijn Coenen <maco@android.com>
Reported-by: syzbot+3ae18325f96190606754@syzkaller.appspotmail.com
Fixes: bde4a19fc0 ("binder: use userspace pointer as base of buffer space")
Suggested-by: Dan Carpenter <dan.carpenter@oracle.com>
Signed-off-by: Todd Kjos <tkjos@google.com>
Cc: stable <stable@vger.kernel.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
syzkallar found a 32-byte memory leak in a rarely executed error
case. The transaction complete work item was not freed if put_user()
failed when writing the BR_TRANSACTION_COMPLETE to the user command
buffer. Fixed by freeing it before put_user() is called.
Reported-by: syzbot+182ce46596c3f2e1eb24@syzkaller.appspotmail.com
Signed-off-by: Todd Kjos <tkjos@google.com>
Cc: stable <stable@vger.kernel.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
There is a race between the binder driver cleaning
up a completed transaction via binder_free_transaction()
and a user calling binder_ioctl(BC_FREE_BUFFER) to
release a buffer. It doesn't matter which is first but
they need to be protected against running concurrently
which can result in a UAF.
Signed-off-by: Todd Kjos <tkjos@google.com>
Cc: stable <stable@vger.kernel.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Based on 1 normalized pattern(s):
this software is licensed under the terms of the gnu general public
license version 2 as published by the free software foundation and
may be copied distributed and modified under those terms this
program is distributed in the hope that it will be useful but
without any warranty without even the implied warranty of
merchantability or fitness for a particular purpose see the gnu
general public license for more details
extracted by the scancode license scanner the SPDX license identifier
GPL-2.0-only
has been chosen to replace the boilerplate/reference in 285 file(s).
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Reviewed-by: Alexios Zavras <alexios.zavras@intel.com>
Reviewed-by: Allison Randal <allison@lohutok.net>
Cc: linux-spdx@vger.kernel.org
Link: https://lkml.kernel.org/r/20190529141900.642774971@linutronix.de
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Add SPDX license identifiers to all Make/Kconfig files which:
- Have no license information of any form
These files fall under the project license, GPL v2 only. The resulting SPDX
license identifier is:
GPL-2.0-only
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Here is the "real" big set of char/misc driver patches for 5.2-rc1
Loads of different driver subsystem stuff in here, all over the places:
- thunderbolt driver updates
- habanalabs driver updates
- nvmem driver updates
- extcon driver updates
- intel_th driver updates
- mei driver updates
- coresight driver updates
- soundwire driver cleanups and updates
- fastrpc driver updates
- other minor driver updates
- chardev minor fixups
Feels like this tree is getting to be a dumping ground of "small driver
subsystems" these days. Which is fine with me, if it makes things
easier for those subsystem maintainers.
All of these have been in linux-next for a while with no reported
issues.
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
-----BEGIN PGP SIGNATURE-----
iGwEABECAC0WIQT0tgzFv3jCIUoxPcsxR9QN2y37KQUCXNHE2w8cZ3JlZ0Brcm9h
aC5jb20ACgkQMUfUDdst+ykvyQCYj5vSHQ88yEU+bzwGzQQLOBWYIwCgm5Iku0Y3
f6V3MvRffg4qUp3cGbU=
=R37j
-----END PGP SIGNATURE-----
Merge tag 'char-misc-5.2-rc1-part2' of git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/char-misc
Pull char/misc update part 2 from Greg KH:
"Here is the "real" big set of char/misc driver patches for 5.2-rc1
Loads of different driver subsystem stuff in here, all over the places:
- thunderbolt driver updates
- habanalabs driver updates
- nvmem driver updates
- extcon driver updates
- intel_th driver updates
- mei driver updates
- coresight driver updates
- soundwire driver cleanups and updates
- fastrpc driver updates
- other minor driver updates
- chardev minor fixups
Feels like this tree is getting to be a dumping ground of "small
driver subsystems" these days. Which is fine with me, if it makes
things easier for those subsystem maintainers.
All of these have been in linux-next for a while with no reported
issues"
* tag 'char-misc-5.2-rc1-part2' of git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/char-misc: (255 commits)
intel_th: msu: Add current window tracking
intel_th: msu: Add a sysfs attribute to trigger window switch
intel_th: msu: Correct the block wrap detection
intel_th: Add switch triggering support
intel_th: gth: Factor out trace start/stop
intel_th: msu: Factor out pipeline draining
intel_th: msu: Switch over to scatterlist
intel_th: msu: Replace open-coded list_{first,last,next}_entry variants
intel_th: Only report useful IRQs to subdevices
intel_th: msu: Start handling IRQs
intel_th: pci: Use MSI interrupt signalling
intel_th: Communicate IRQ via resource
intel_th: Add "rtit" source device
intel_th: Skip subdevices if their MMIO is missing
intel_th: Rework resource passing between glue layers and core
intel_th: SPDX-ify the documentation
intel_th: msu: Fix single mode with IOMMU
coresight: funnel: Support static funnel
dt-bindings: arm: coresight: Unify funnel DT binding
coresight: replicator: Add new device id for static replicator
...
When allocating space in the target buffer for the security context,
make sure the extra_buffers_size doesn't overflow. This can only
happen if the given size is invalid, but an overflow can turn it
into a valid size. Fail the transaction if an overflow is detected.
Signed-off-by: Todd Kjos <tkjos@google.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Restore the behavior of locking mmap_sem for reading in
binder_alloc_free_page(), as was first done in commit 3013bf62b6
("binder: reduce mmap_sem write-side lock"). That change was
inadvertently reverted by commit 5cec2d2e58 ("binder: fix race between
munmap() and direct reclaim").
In addition, change the name of the label for the error path to
accurately reflect that we're taking the lock for reading.
Backporting note: This fix is only needed when *both* of the commits
mentioned above are applied. That's an unlikely situation since they
both landed during the development of v5.1 but only one of them is
targeted for stable.
Fixes: 5cec2d2e58 ("binder: fix race between munmap() and direct reclaim")
Signed-off-by: Tyler Hicks <tyhicks@canonical.com>
Acked-by: Todd Kjos <tkjos@android.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
An munmap() on a binder device causes binder_vma_close() to be called
which clears the alloc->vma pointer.
If direct reclaim causes binder_alloc_free_page() to be called, there
is a race where alloc->vma is read into a local vma pointer and then
used later after the mm->mmap_sem is acquired. This can result in
calling zap_page_range() with an invalid vma which manifests as a
use-after-free in zap_page_range().
The fix is to check alloc->vma after acquiring the mmap_sem (which we
were acquiring anyway) and skip zap_page_range() if it has changed
to NULL.
Signed-off-by: Todd Kjos <tkjos@google.com>
Reviewed-by: Joel Fernandes (Google) <joel@joelfernandes.org>
Cc: stable <stable@vger.kernel.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
The selinux-testsuite found an issue resulting in a BUG_ON()
where a conditional relied on a size_t going negative when
checking the validity of a buffer offset.
Fixes: 7a67a39320 ("binder: add function to copy binder object from buffer")
Reported-by: Paul Moore <paul@paul-moore.com>
Tested-by: Paul Moore <paul@paul-moore.com>
Signed-off-by: Todd Kjos <tkjos@google.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
binder has used write-side mmap_sem semaphore to release memory
mapped at address space of the process. However, right lock to
release pages is down_read, not down_write because page table lock
already protects the race for parallel freeing.
Please do not use mmap_sem write-side lock which is well known
contented lock.
Cc: Todd Kjos <tkjos@google.com>
Cc: Martijn Coenen <maco@android.com>
Cc: Arve Hjønnevåg <arve@android.com>
Signed-off-by: Minchan Kim <minchan@kernel.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Fixes sparse issues reported by the kbuild test robot running
on https://git.kernel.org/pub/scm/linux/kernel/git/gregkh/char-misc.git
char-misc-testing: bde4a19fc0 ("binder: use userspace pointer as base
of buffer space")
Error output (drivers/android/binder_alloc_selftest.c):
sparse: warning: incorrect type in assignment (different address spaces)
sparse: expected void *page_addr
sparse: got void [noderef] <asn:1> *user_data
sparse: error: subtraction of different types can't work
Fixed by adding necessary "__user" tags.
Reported-by: kbuild test robot <lkp@intel.com>
Signed-off-by: Todd Kjos <tkjos@google.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Now that alloc->buffer points to the userspace vm_area
rename buffer->data to buffer->user_data and rename
local pointers that hold user addresses. Also use the
"__user" tag to annotate all user pointers so sparse
can flag cases where user pointer vaues are copied to
kernel pointers. Refactor code to use offsets instead
of user pointers.
Signed-off-by: Todd Kjos <tkjos@google.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Remove user_buffer_offset since there is no kernel
buffer pointer anymore.
Signed-off-by: Todd Kjos <tkjos@google.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Remove the kernel's vm_area and the code that maps
buffer pages into it.
Signed-off-by: Todd Kjos <tkjos@google.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Refactor the functions to validate and fixup struct
binder_buffer pointer objects to avoid using vm_area
pointers. Instead copy to/from kernel space using
binder_alloc_copy_to_buffer() and
binder_alloc_copy_from_buffer(). The following
functions were refactored:
refactor binder_validate_ptr()
binder_validate_fixup()
binder_fixup_parent()
Signed-off-by: Todd Kjos <tkjos@google.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
When creating or tearing down a transaction, the binder driver
examines objects in the buffer and takes appropriate action.
To do this without needing to dereference pointers into the
buffer, the local copies of the objects are needed. This patch
introduces a function to validate and copy binder objects
from the buffer to a local structure.
Signed-off-by: Todd Kjos <tkjos@google.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Avoid vm_area when copying to or from binder buffers.
Instead, new copy functions are added that copy from
kernel space to binder buffer space. These use
kmap_atomic() and kunmap_atomic() to create temporary
mappings and then memcpy() is used to copy within
that page.
Also, kmap_atomic() / kunmap_atomic() use the appropriate
cache flushing to support VIVT cache architectures.
Allow binder to build if CPU_CACHE_VIVT is defined.
Several uses of the new functions are added here. More
to follow in subsequent patches.
Signed-off-by: Todd Kjos <tkjos@google.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
The binder driver uses a vm_area to map the per-process
binder buffer space. For 32-bit android devices, this is
now taking too much vmalloc space. This patch removes
the use of vm_area when copying the transaction data
from the sender to the buffer space. Instead of using
copy_from_user() for multi-page copies, it now uses
binder_alloc_copy_user_to_buffer() which uses kmap()
and kunmap() to map each page, and uses copy_from_user()
for copying to that page.
Signed-off-by: Todd Kjos <tkjos@google.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
binderfs should not have a separate device_initcall(). When a kernel is
compiled with CONFIG_ANDROID_BINDERFS register the filesystem alongside
CONFIG_ANDROID_IPC. This use-case is especially sensible when users specify
CONFIG_ANDROID_IPC=y, CONFIG_ANDROID_BINDERFS=y and
ANDROID_BINDER_DEVICES="".
When CONFIG_ANDROID_BINDERFS=n then this always succeeds so there's no
regression potential for legacy workloads.
Signed-off-by: Christian Brauner <christian@brauner.io>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
We currently adhere to the reserved devices limit when creating new
binderfs devices in binderfs instances not located in the inital ipc
namespace. But it is still possible to rob the host instances of their 4
reserved devices by creating the maximum allowed number of devices in a
single binderfs instance located in a non-initial ipc namespace and then
mounting 4 separate binderfs instances in non-initial ipc namespaces. That
happens because the limit is currently not respected for the creation of
the initial binder-control device node. Block this nonsense by performing
the same check in binderfs_binder_ctl_create() that we perform in
binderfs_binder_device_create().
Fixes: 36bdf3cae0 ("binderfs: reserve devices for initial mount")
Signed-off-by: Christian Brauner <christian@brauner.io>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
To allow servers to verify client identity, allow a node
flag to be set that causes the sender's security context
to be delivered with the transaction. The BR_TRANSACTION
command is extended in BR_TRANSACTION_SEC_CTX to
contain a pointer to the security context string.
Signed-off-by: Todd Kjos <tkjos@google.com>
Reviewed-by: Joel Fernandes (Google) <joel@joelfernandes.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
The binderfs_binder_ctl_create() call is a no-op on subsequent calls and
the first call is done before we unlock the suberblock. Hence, there is no
need to take inode_lock() in there. Let's remove it.
Suggested-by: Al Viro <viro@zeniv.linux.org.uk>
Signed-off-by: Christian Brauner <christian@brauner.io>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Al pointed out that first calling kill_litter_super() before cleaning up
info is more correct since destroying info doesn't depend on the state of
the dentries and inodes. That the opposite remains true is not guaranteed.
Suggested-by: Al Viro <viro@zeniv.linux.org.uk>
Signed-off-by: Christian Brauner <christian@brauner.io>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
- switch from d_alloc_name() + d_lookup() to lookup_one_len():
Instead of using d_alloc_name() and then doing a d_lookup() with the
allocated dentry to find whether a device with the name we're trying to
create already exists switch to using lookup_one_len(). The latter will
either return the existing dentry or a new one.
- switch from kmalloc() + strscpy() to kmemdup():
Use a more idiomatic way to copy the name for the new dentry that
userspace gave us.
Suggested-by: Al Viro <viro@zeniv.linux.org.uk>
Signed-off-by: Christian Brauner <christian@brauner.io>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Al pointed out that on binderfs_fill_super() error
deactivate_locked_super() will call binderfs_kill_super() so all of the
freeing and putting we currently do in binderfs_fill_super() is unnecessary
and buggy. Let's simply return errors and let binderfs_fill_super() take
care of cleaning up on error.
Suggested-by: Al Viro <viro@zeniv.linux.org.uk>
Signed-off-by: Christian Brauner <christian@brauner.io>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
- make binderfs control dentry immutable:
We don't allow to unlink it since it is crucial for binderfs to be
useable but if we allow to rename it we make the unlink trivial to
bypass. So prevent renaming too and simply treat the control dentry as
immutable.
- add is_binderfs_control_device() helper:
Take the opportunity and turn the check for the control dentry into a
separate helper is_binderfs_control_device() since it's now used in two
places.
- simplify binderfs_rename():
Instead of hand-rolling our custom version of simple_rename() just dumb
the whole function down to first check whether we're trying to rename the
control dentry. If we do EPERM the caller and if not call simple_rename().
Suggested-by: Al Viro <viro@zeniv.linux.org.uk>
Signed-off-by: Christian Brauner <christian@brauner.io>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
The comment stems from an early version of that patchset and is just
confusing now.
Cc: Al Viro <viro@zeniv.linux.org.uk>
Signed-off-by: Christian Brauner <christian@brauner.io>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Fix to return a negative error code -ENOMEM from the new_inode() and
d_make_root() error handling cases instead of 0, as done elsewhere in
this function.
Fixes: 849d540ddf ("binderfs: implement "max" mount option")
Signed-off-by: Wei Yongjun <weiyongjun1@huawei.com>
Reviewed-by: Christian Brauner <christian@brauner.io>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
kbuild reported a build faile in [1]. This is triggered when CONFIG_IPC_NS
is not set. So let's make the use of init_ipc_ns conditional on
CONFIG_IPC_NS being set.
[1]: https://lists.01.org/pipermail/kbuild-all/2019-January/056903.html
Signed-off-by: Christian Brauner <christian.brauner@ubuntu.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
The binderfs instance in the initial ipc namespace will always have a
reserve of 4 binder devices unless explicitly capped by specifying a lower
value via the "max" mount option.
This ensures when binder devices are removed (on accident or on purpose)
they can always be recreated without risking that all minor numbers have
already been used up.
Cc: Todd Kjos <tkjos@google.com>
Cc: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Signed-off-by: Christian Brauner <christian.brauner@ubuntu.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
It doesn't make sense to call the header binder_ctl.h when its sole
existence is tied to binderfs. So give it a sensible name. Users will far
more easily remember binderfs.h than binder_ctl.h.
Signed-off-by: Christian Brauner <christian.brauner@ubuntu.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Since binderfs can be mounted by userns root in non-initial user namespaces
some precautions are in order. First, a way to set a maximum on the number
of binder devices that can be allocated per binderfs instance and second, a
way to reserve a reasonable chunk of binderfs devices for the initial ipc
namespace.
A first approach as seen in [1] used sysctls similiar to devpts but was
shown to be flawed (cf. [2] and [3]) since some aspects were unneeded. This
is an alternative approach which avoids sysctls completely and instead
switches to a single mount option.
Starting with this commit binderfs instances can be mounted with a limit on
the number of binder devices that can be allocated. The max=<count> mount
option serves as a per-instance limit. If max=<count> is set then only
<count> number of binder devices can be allocated in this binderfs
instance.
This allows to safely bind-mount binderfs instances into unprivileged user
namespaces since userns root in a non-initial user namespace cannot change
the mount option as long as it does not own the mount namespace the
binderfs mount was created in and hence cannot drain the host of minor
device numbers
[1]: https://lore.kernel.org/lkml/20181221133909.18794-1-christian@brauner.io/
[2]; https://lore.kernel.org/lkml/20181221163316.GA8517@kroah.com/
[3]: https://lore.kernel.org/lkml/CAHRSSEx+gDVW4fKKK8oZNAir9G5icJLyodO8hykv3O0O1jt2FQ@mail.gmail.com/
[4]: https://lore.kernel.org/lkml/20181221192044.5yvfnuri7gdop4rs@brauner.io/
Cc: Todd Kjos <tkjos@google.com>
Cc: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Signed-off-by: Christian Brauner <christian.brauner@ubuntu.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>