forked from luck/tmp_suning_uos_patched
c9b1d0981f
Add user_reserve_kbytes knob.

Limit the growth of the memory reserved for other user processes to
min(3% current process size, user_reserve_pages).  Only about 8MB is
necessary to enable recovery in the default mode, and only a few hundred
MB are required even when overcommit is disabled.

user_reserve_pages defaults to min(3% free pages, 128MB)

I arrived at 128MB by taking the max VSZ of sshd, login, bash, and top ...
then adding the RSS of each.

This only affects OVERCOMMIT_NEVER mode.

Background

1. user reserve

__vm_enough_memory reserves a hardcoded 3% of the current process size for
other applications when overcommit is disabled.  This was done so that a
user could recover if they launched a memory hogging process.  Without the
reserve, a user would easily run into a message such as:

bash: fork: Cannot allocate memory

2. admin reserve

Additionally, a hardcoded 3% of free memory is reserved for root in both
overcommit 'guess' and 'never' modes.  This was intended to prevent a
scenario where root cannot log in and perform recovery operations.

Note that this reserve shrinks, and doesn't guarantee a useful reserve.

Motivation

The two hardcoded memory reserves should be updated to account for current
memory sizes.

Also, the admin reserve would be more useful if it didn't shrink too much.

When the current code was originally written, 1GB was considered
"enterprise".  Now the 3% reserve can grow to multiple GB on large memory
systems, and it only needs to be a few hundred MB at most to enable a user
or admin to recover a system with an unwanted memory hogging process.

I've found that reducing these reserves is especially beneficial for a
specific type of application load:

 * single application system
 * one or few processes (e.g. one per core)
 * allocating all available memory
 * not initializing every page immediately
 * long running

I've run scientific clusters with this sort of load.  A long running job
sometimes failed many hours (weeks of CPU time) into a calculation.  The
jobs weren't initializing all of their memory immediately, and they weren't
using calloc, so I put systems into overcommit 'never' mode.  These
clusters run diskless and have no swap.

However, with the current reserves, a user wishing to allocate as much
memory as possible to one process may be prevented from using, for
example, almost 2GB out of 32GB.

The effect is less, but still significant when a user starts a job with
one process per core.  I have repeatedly seen a set of processes
requesting the same amount of memory fail because one of them could not
allocate the amount of memory a user would expect to be able to allocate.
This happens, for example, with Message Passing Interface (MPI) processes
running one per core, and it is similar for other parallel programming
frameworks.

Changing this reserve code will make the overcommit never mode more useful
by allowing applications to allocate nearly all of the available memory.

Also, the new admin_reserve_kbytes will be safer than the current behavior
since the hardcoded 3% of available memory reserve can shrink to something
useless in the case where applications have grabbed all available memory.

Risks

* "bash: fork: Cannot allocate memory"

  The downside of the first patch, which creates a tunable user reserve
  that is only used in overcommit 'never' mode, is that an admin can set
  it so low that a user may not be able to kill their process, even if
  they already have a shell prompt.

  Of course, a user can get in the same predicament with the current 3%
  reserve: they just have to launch processes until 3% becomes negligible.

* root-cant-log-in problem

  The second patch, adding the tunable rootuser_reserve_pages, allows
  the admin to shoot themselves in the foot by setting it too small.  They
  can easily get the system into a state where root-can't-log-in.

  However, the new admin_reserve_kbytes will be safer than the current
  behavior since the hardcoded 3% of available memory reserve can shrink
  to something useless in the case where applications have grabbed all
  available memory.

Alternatives

 * Memory cgroups provide a more flexible way to limit application memory.

   Not everyone wants to set up cgroups or deal with their overhead.

 * We could create a fourth overcommit mode which provides smaller
   reserves.

   The size of useful reserves may be drastically different depending
   on whether the system is embedded or enterprise.

 * Force users to initialize all of their memory or use calloc.

   Some users don't want/expect the system to overcommit when they malloc.
   Overcommit 'never' mode is for this scenario, and it should work well.

The new user and admin reserve tunables are simple to use, with low
overhead compared to cgroups.  The patches preserve current behavior where
3% of memory is less than 128MB, except that the admin reserve doesn't
shrink to an unusable size under pressure.  The code allows admins to tune
for embedded and enterprise usage.

FAQ

 * How is the root-cant-login problem addressed?
   What happens if admin_reserve_pages is set to 0?

   Root is free to shoot themselves in the foot by setting
   admin_reserve_kbytes too low.

   On x86_64, the minimum useful reserve is:
     8MB for overcommit 'guess'
   128MB for overcommit 'never'

   admin_reserve_pages defaults to min(3% free memory, 8MB)

   So, anyone switching to 'never' mode needs to adjust
   admin_reserve_pages.

 * How do you calculate a minimum useful reserve?

   A user or the admin needs enough memory to login and perform
   recovery operations, which includes, at a minimum:

   sshd or login + bash (or some other shell) + top (or ps, kill, etc.)

   For overcommit 'guess', we can sum resident set sizes (RSS)
   because we only need enough memory to handle what the recovery
   programs will typically use.  On x86_64 this is about 8MB.

   For overcommit 'never', we can take the max of their virtual sizes
   (VSZ) and add the sum of their RSS.  We use VSZ instead of RSS because
   this mode forces us to ensure we can fulfill all of the requested
   memory allocations, even if the programs only use a fraction of what
   they ask for.  On x86_64 this is about 128MB.

   When swap is enabled, reserves are useful even when they are as
   small as 10MB, regardless of overcommit mode.

   When both swap and overcommit are disabled, then the admin should
   tune the reserves higher to be absolutely safe.  Over 230MB each
   was safest in my testing.

 * What happens if user_reserve_pages is set to 0?

   Note, this only affects overcommit 'never' mode.

   Then a user will be able to allocate all available memory minus
   admin_reserve_kbytes.

   However, they will easily see a message such as:

   "bash: fork: Cannot allocate memory"

   And they won't be able to recover/kill their application.
   The admin should be able to recover the system if
   admin_reserve_kbytes is set appropriately.

 * What's the difference between overcommit 'guess' and 'never'?

   "Guess" allows an allocation if there are enough free + reclaimable
   pages.  It has a hardcoded 3% of free pages reserved for root.

   "Never" allows an allocation if there is enough swap + a configurable
   percentage (default is 50) of physical RAM.  It has a hardcoded 3% of
   free pages reserved for root, like "Guess" mode.  It also has a
   hardcoded 3% of the current process size reserved for additional
   applications.

 * Why is overcommit 'guess' not suitable even when an app eventually
   writes to every page?  It takes free pages, file pages, available
   swap pages, reclaimable slab pages into consideration.  In other words,
   these are all pages available, then why isn't overcommit suitable?

   Because it only looks at the present state of the system.  It
   does not take into account the memory that other applications have
   malloced, but haven't initialized yet.  It overcommits the system.

Test Summary

There was little change in behavior in the default overcommit 'guess'
mode with swap enabled before and after the patch.  This was expected.

Systems run most predictably (i.e. no oom kills) in overcommit 'never'
mode with swap enabled.  This also allowed the most memory to be allocated
to a user application.

Overcommit 'guess' mode without swap is a bad idea.  It is easy to
crash the system.  None of the other tested combinations crashed.
This matches my experience on the Roadrunner supercomputer.

Without the tunable user reserve, a system in overcommit 'never' mode
and without swap does not allow the user to recover, although the
admin can.

With the new tunable reserves, a system in overcommit 'never' mode
and without swap can be configured to:

1. maximize user-allocatable memory, running close to the edge of
   recoverability

2. maximize recoverability, sacrificing allocatable memory to
   ensure that a user cannot take down a system

Test Description

Fedora 18 VM - 4 x86_64 cores, 5725MB RAM, 4GB Swap

System is booted into multiuser console mode, with unnecessary services
turned off.  Caches were dropped before each test.

Hogs are user memtester processes that attempt to allocate all free memory
as reported by /proc/meminfo

In overcommit 'never' mode, memory_ratio=100

Test Results

3.9.0-rc1-mm1

Overcommit | Swap | Hogs | MB Got/Wanted | OOMs  | User Recovery | Admin Recovery
---------- | ---- | ---- | ------------- | ----- | ------------- | --------------
guess      | yes  | 1    | 5432/5432     | no    | yes           | yes
guess      | yes  | 4    | 5444/5444     | 1     | yes           | yes
guess      | no   | 1    | 5302/5449     | no    | yes           | yes
guess      | no   | 4    | -             | crash | no            | no

never      | yes  | 1    | 5460/5460     | 1     | yes           | yes
never      | yes  | 4    | 5460/5460     | 1     | yes           | yes
never      | no   | 1    | 5218/5432     | no    | no            | yes
never      | no   | 4    | 5203/5448     | no    | no            | yes

3.9.0-rc1-mm1-tunablereserves

User and Admin Recovery show their respective reserves, if applicable.

Overcommit | Swap | Hogs | MB Got/Wanted | OOMs  | User Recovery | Admin Recovery
---------- | ---- | ---- | ------------- | ----- | ------------- | --------------
guess      | yes  | 1    | 5419/5419     | no    | - yes         | 8MB yes
guess      | yes  | 4    | 5436/5436     | 1     | - yes         | 8MB yes
guess      | no   | 1    | 5440/5440     | *     | - yes         | 8MB yes
guess      | no   | 4    | -             | crash | - no          | 8MB no

* process would successfully mlock, then the oom killer would pick it

never      | yes  | 1    | 5446/5446     | no    | 10MB yes      | 20MB yes
never      | yes  | 4    | 5456/5456     | no    | 10MB yes      | 20MB yes
never      | no   | 1    | 5387/5429     | no    | 128MB no      | 8MB barely
never      | no   | 1    | 5323/5428     | no    | 226MB barely  | 8MB barely
never      | no   | 1    | 5323/5428     | no    | 226MB barely  | 8MB barely

never      | no   | 1    | 5359/5448     | no    | 10MB no       | 10MB barely

never      | no   | 1    | 5323/5428     | no    | 0MB no        | 10MB barely
never      | no   | 1    | 5332/5428     | no    | 0MB no        | 50MB yes
never      | no   | 1    | 5293/5429     | no    | 0MB no        | 90MB yes

never      | no   | 1    | 5001/5427     | no    | 230MB yes     | 338MB yes
never      | no   | 4*   | 4998/5424     | no    | 230MB yes     | 338MB yes

* more memtesters were launched, able to allocate approximately another 100MB

Future Work

 - Test larger memory systems.

 - Test an embedded image.

 - Test other architectures.

 - Time malloc microbenchmarks.

 - Would it be useful to be able to set overcommit policy for
   each memory cgroup?

 - Some lines are slightly above 80 chars.
   Perhaps define a macro to convert between pages and kb?
   Other places in the kernel do this.

[akpm@linux-foundation.org: coding-style fixes]
[akpm@linux-foundation.org: make init_user_reserve() static]
Signed-off-by: Andrew Shewmaker <agshew@gmail.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
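The reserve arithmetic described in the commit message can be sketched in userspace C. This is only an illustration, not the kernel code: the helper names (user_reserve_kb, min_reserve_guess_kb, min_reserve_never_kb) and struct prog_mem are hypothetical, all sizes are in kilobytes, and approximating the 3% share with a divide by 32 is an assumption of this sketch.

```c
#include <stddef.h>

/* Hypothetical record for one recovery program (e.g. sshd, bash, top). */
struct prog_mem {
	unsigned long vsz_kb;	/* virtual size */
	unsigned long rss_kb;	/* resident set size */
};

/*
 * Reserve held back for other user processes in overcommit 'never' mode:
 * min(3% of the current process size, user_reserve_kbytes).
 * Assumption: 3% is approximated as 1/32 of the process size.
 */
unsigned long user_reserve_kb(unsigned long process_size_kb,
			      unsigned long user_reserve_kbytes)
{
	unsigned long three_percent = process_size_kb / 32;

	return three_percent < user_reserve_kbytes ?
		three_percent : user_reserve_kbytes;
}

/* Overcommit 'guess': the sum of the recovery programs' RSS is enough. */
unsigned long min_reserve_guess_kb(const struct prog_mem *progs, size_t n)
{
	unsigned long sum_rss = 0;
	size_t i;

	for (i = 0; i < n; i++)
		sum_rss += progs[i].rss_kb;
	return sum_rss;
}

/* Overcommit 'never': the largest VSZ plus the sum of every RSS. */
unsigned long min_reserve_never_kb(const struct prog_mem *progs, size_t n)
{
	unsigned long max_vsz = 0;
	size_t i;

	for (i = 0; i < n; i++)
		if (progs[i].vsz_kb > max_vsz)
			max_vsz = progs[i].vsz_kb;
	return max_vsz + min_reserve_guess_kb(progs, n);
}
```

With a 32GB process and a 128MB knob value, user_reserve_kb(32UL * 1024 * 1024, 128UL * 1024) returns the knob value, so the reserve stops growing once 3% of the process size exceeds it. For a 1GB process the 3% term is the smaller one and is returned instead.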
2646 lines
60 KiB
C
/*
 * sysctl.c: General linux system control interface
 *
 * Begun 24 March 1995, Stephen Tweedie
 * Added /proc support, Dec 1995
 * Added bdflush entry and intvec min/max checking, 2/23/96, Tom Dyas.
 * Added hooks for /proc/sys/net (minor, minor patch), 96/4/1, Mike Shaver.
 * Added kernel/java-{interpreter,appletviewer}, 96/5/10, Mike Shaver.
 * Dynamic registration fixes, Stephen Tweedie.
 * Added kswapd-interval, ctrl-alt-del, printk stuff, 1/8/97, Chris Horn.
 * Made sysctl support optional via CONFIG_SYSCTL, 1/10/97, Chris
 * Horn.
 * Added proc_doulongvec_ms_jiffies_minmax, 09/08/99, Carlos H. Bauer.
 * Added proc_doulongvec_minmax, 09/08/99, Carlos H. Bauer.
 * Changed linked lists to use list.h instead of lists.h, 02/24/00, Bill
 * Wendling.
 * The list_for_each() macro wasn't appropriate for the sysctl loop.
 * Removed it and replaced it with older style, 03/23/00, Bill Wendling
 */

#include <linux/module.h>
#include <linux/mm.h>
#include <linux/swap.h>
#include <linux/slab.h>
#include <linux/sysctl.h>
#include <linux/bitmap.h>
#include <linux/signal.h>
#include <linux/printk.h>
#include <linux/proc_fs.h>
#include <linux/security.h>
#include <linux/ctype.h>
#include <linux/kmemcheck.h>
#include <linux/kmemleak.h>
#include <linux/fs.h>
#include <linux/init.h>
#include <linux/kernel.h>
#include <linux/kobject.h>
#include <linux/net.h>
#include <linux/sysrq.h>
#include <linux/highuid.h>
#include <linux/writeback.h>
#include <linux/ratelimit.h>
#include <linux/compaction.h>
#include <linux/hugetlb.h>
#include <linux/initrd.h>
#include <linux/key.h>
#include <linux/times.h>
#include <linux/limits.h>
#include <linux/dcache.h>
#include <linux/dnotify.h>
#include <linux/syscalls.h>
#include <linux/vmstat.h>
#include <linux/nfs_fs.h>
#include <linux/acpi.h>
#include <linux/reboot.h>
#include <linux/ftrace.h>
#include <linux/perf_event.h>
#include <linux/kprobes.h>
#include <linux/pipe_fs_i.h>
#include <linux/oom.h>
#include <linux/kmod.h>
#include <linux/capability.h>
#include <linux/binfmts.h>
#include <linux/sched/sysctl.h>

#include <asm/uaccess.h>
#include <asm/processor.h>

#ifdef CONFIG_X86
#include <asm/nmi.h>
#include <asm/stacktrace.h>
#include <asm/io.h>
#endif
#ifdef CONFIG_SPARC
#include <asm/setup.h>
#endif
#ifdef CONFIG_BSD_PROCESS_ACCT
#include <linux/acct.h>
#endif
#ifdef CONFIG_RT_MUTEXES
#include <linux/rtmutex.h>
#endif
#if defined(CONFIG_PROVE_LOCKING) || defined(CONFIG_LOCK_STAT)
#include <linux/lockdep.h>
#endif
#ifdef CONFIG_CHR_DEV_SG
#include <scsi/sg.h>
#endif

#ifdef CONFIG_LOCKUP_DETECTOR
#include <linux/nmi.h>
#endif


#if defined(CONFIG_SYSCTL)

/* External variables not in a header file. */
extern int sysctl_overcommit_memory;
extern int sysctl_overcommit_ratio;
extern int max_threads;
extern int suid_dumpable;
#ifdef CONFIG_COREDUMP
extern int core_uses_pid;
extern char core_pattern[];
extern unsigned int core_pipe_limit;
#endif
extern int pid_max;
extern int pid_max_min, pid_max_max;
extern int percpu_pagelist_fraction;
extern int compat_log;
extern int latencytop_enabled;
extern int sysctl_nr_open_min, sysctl_nr_open_max;
#ifndef CONFIG_MMU
extern int sysctl_nr_trim_pages;
#endif
#ifdef CONFIG_BLOCK
extern int blk_iopoll_enabled;
#endif

/* Constants used for minimum and maximum */
#ifdef CONFIG_LOCKUP_DETECTOR
static int sixty = 60;
static int neg_one = -1;
#endif

static int zero;
static int __maybe_unused one = 1;
static int __maybe_unused two = 2;
static int __maybe_unused three = 3;
static unsigned long one_ul = 1;
static int one_hundred = 100;
#ifdef CONFIG_PRINTK
static int ten_thousand = 10000;
#endif

/* this is needed for the proc_doulongvec_minmax of vm_dirty_bytes */
static unsigned long dirty_bytes_min = 2 * PAGE_SIZE;

/* this is needed for the proc_dointvec_minmax for [fs_]overflow UID and GID */
static int maxolduid = 65535;
static int minolduid;
static int min_percpu_pagelist_fract = 8;

static int ngroups_max = NGROUPS_MAX;
static const int cap_last_cap = CAP_LAST_CAP;

#ifdef CONFIG_INOTIFY_USER
#include <linux/inotify.h>
#endif
#ifdef CONFIG_SPARC
#endif

#ifdef CONFIG_SPARC64
extern int sysctl_tsb_ratio;
#endif

#ifdef __hppa__
extern int pwrsw_enabled;
#endif

#ifdef CONFIG_SYSCTL_ARCH_UNALIGN_ALLOW
extern int unaligned_enabled;
#endif

#ifdef CONFIG_IA64
extern int unaligned_dump_stack;
#endif

#ifdef CONFIG_SYSCTL_ARCH_UNALIGN_NO_WARN
extern int no_unaligned_warning;
#endif

#ifdef CONFIG_PROC_SYSCTL
static int proc_do_cad_pid(struct ctl_table *table, int write,
		void __user *buffer, size_t *lenp, loff_t *ppos);
static int proc_taint(struct ctl_table *table, int write,
		void __user *buffer, size_t *lenp, loff_t *ppos);
#endif

#ifdef CONFIG_PRINTK
static int proc_dointvec_minmax_sysadmin(struct ctl_table *table, int write,
		void __user *buffer, size_t *lenp, loff_t *ppos);
#endif

static int proc_dointvec_minmax_coredump(struct ctl_table *table, int write,
		void __user *buffer, size_t *lenp, loff_t *ppos);
#ifdef CONFIG_COREDUMP
static int proc_dostring_coredump(struct ctl_table *table, int write,
		void __user *buffer, size_t *lenp, loff_t *ppos);
#endif

#ifdef CONFIG_MAGIC_SYSRQ
/* Note: sysrq code uses its own private copy */
static int __sysrq_enabled = SYSRQ_DEFAULT_ENABLE;

static int sysrq_sysctl_handler(ctl_table *table, int write,
				void __user *buffer, size_t *lenp,
				loff_t *ppos)
{
	int error;

	error = proc_dointvec(table, write, buffer, lenp, ppos);
	if (error)
		return error;

	if (write)
		sysrq_toggle_support(__sysrq_enabled);

	return 0;
}

#endif

static struct ctl_table kern_table[];
static struct ctl_table vm_table[];
static struct ctl_table fs_table[];
static struct ctl_table debug_table[];
static struct ctl_table dev_table[];
extern struct ctl_table random_table[];
#ifdef CONFIG_EPOLL
extern struct ctl_table epoll_table[];
#endif

#ifdef HAVE_ARCH_PICK_MMAP_LAYOUT
int sysctl_legacy_va_layout;
#endif

/* The default sysctl tables: */

static struct ctl_table sysctl_base_table[] = {
	{
		.procname = "kernel",
		.mode = 0555,
		.child = kern_table,
	},
	{
		.procname = "vm",
		.mode = 0555,
		.child = vm_table,
	},
	{
		.procname = "fs",
		.mode = 0555,
		.child = fs_table,
	},
	{
		.procname = "debug",
		.mode = 0555,
		.child = debug_table,
	},
	{
		.procname = "dev",
		.mode = 0555,
		.child = dev_table,
	},
	{ }
};

#ifdef CONFIG_SCHED_DEBUG
static int min_sched_granularity_ns = 100000; /* 100 usecs */
static int max_sched_granularity_ns = NSEC_PER_SEC; /* 1 second */
static int min_wakeup_granularity_ns; /* 0 usecs */
static int max_wakeup_granularity_ns = NSEC_PER_SEC; /* 1 second */
#ifdef CONFIG_SMP
static int min_sched_tunable_scaling = SCHED_TUNABLESCALING_NONE;
static int max_sched_tunable_scaling = SCHED_TUNABLESCALING_END-1;
#endif /* CONFIG_SMP */
#endif /* CONFIG_SCHED_DEBUG */

#ifdef CONFIG_COMPACTION
static int min_extfrag_threshold;
static int max_extfrag_threshold = 1000;
#endif

static struct ctl_table kern_table[] = {
	{
		.procname = "sched_child_runs_first",
		.data = &sysctl_sched_child_runs_first,
		.maxlen = sizeof(unsigned int),
		.mode = 0644,
		.proc_handler = proc_dointvec,
	},
#ifdef CONFIG_SCHED_DEBUG
	{
		.procname = "sched_min_granularity_ns",
		.data = &sysctl_sched_min_granularity,
		.maxlen = sizeof(unsigned int),
		.mode = 0644,
		.proc_handler = sched_proc_update_handler,
		.extra1 = &min_sched_granularity_ns,
		.extra2 = &max_sched_granularity_ns,
	},
	{
		.procname = "sched_latency_ns",
		.data = &sysctl_sched_latency,
		.maxlen = sizeof(unsigned int),
		.mode = 0644,
		.proc_handler = sched_proc_update_handler,
		.extra1 = &min_sched_granularity_ns,
		.extra2 = &max_sched_granularity_ns,
	},
	{
		.procname = "sched_wakeup_granularity_ns",
		.data = &sysctl_sched_wakeup_granularity,
		.maxlen = sizeof(unsigned int),
		.mode = 0644,
		.proc_handler = sched_proc_update_handler,
		.extra1 = &min_wakeup_granularity_ns,
		.extra2 = &max_wakeup_granularity_ns,
	},
#ifdef CONFIG_SMP
	{
		.procname = "sched_tunable_scaling",
		.data = &sysctl_sched_tunable_scaling,
		.maxlen = sizeof(enum sched_tunable_scaling),
		.mode = 0644,
		.proc_handler = sched_proc_update_handler,
		.extra1 = &min_sched_tunable_scaling,
		.extra2 = &max_sched_tunable_scaling,
	},
	{
		.procname = "sched_migration_cost_ns",
		.data = &sysctl_sched_migration_cost,
		.maxlen = sizeof(unsigned int),
		.mode = 0644,
		.proc_handler = proc_dointvec,
	},
	{
		.procname = "sched_nr_migrate",
		.data = &sysctl_sched_nr_migrate,
		.maxlen = sizeof(unsigned int),
		.mode = 0644,
		.proc_handler = proc_dointvec,
	},
	{
		.procname = "sched_time_avg_ms",
		.data = &sysctl_sched_time_avg,
		.maxlen = sizeof(unsigned int),
		.mode = 0644,
		.proc_handler = proc_dointvec,
	},
	{
		.procname = "sched_shares_window_ns",
		.data = &sysctl_sched_shares_window,
		.maxlen = sizeof(unsigned int),
		.mode = 0644,
		.proc_handler = proc_dointvec,
	},
	{
		.procname = "timer_migration",
		.data = &sysctl_timer_migration,
		.maxlen = sizeof(unsigned int),
		.mode = 0644,
		.proc_handler = proc_dointvec_minmax,
		.extra1 = &zero,
		.extra2 = &one,
	},
#endif /* CONFIG_SMP */
#ifdef CONFIG_NUMA_BALANCING
	{
		.procname = "numa_balancing_scan_delay_ms",
		.data = &sysctl_numa_balancing_scan_delay,
		.maxlen = sizeof(unsigned int),
		.mode = 0644,
		.proc_handler = proc_dointvec,
	},
	{
		.procname = "numa_balancing_scan_period_min_ms",
		.data = &sysctl_numa_balancing_scan_period_min,
		.maxlen = sizeof(unsigned int),
		.mode = 0644,
		.proc_handler = proc_dointvec,
	},
	{
		.procname = "numa_balancing_scan_period_reset",
		.data = &sysctl_numa_balancing_scan_period_reset,
		.maxlen = sizeof(unsigned int),
		.mode = 0644,
		.proc_handler = proc_dointvec,
	},
	{
		.procname = "numa_balancing_scan_period_max_ms",
		.data = &sysctl_numa_balancing_scan_period_max,
		.maxlen = sizeof(unsigned int),
		.mode = 0644,
		.proc_handler = proc_dointvec,
	},
	{
		.procname = "numa_balancing_scan_size_mb",
		.data = &sysctl_numa_balancing_scan_size,
		.maxlen = sizeof(unsigned int),
		.mode = 0644,
		.proc_handler = proc_dointvec,
	},
#endif /* CONFIG_NUMA_BALANCING */
#endif /* CONFIG_SCHED_DEBUG */
	{
		.procname = "sched_rt_period_us",
		.data = &sysctl_sched_rt_period,
		.maxlen = sizeof(unsigned int),
		.mode = 0644,
		.proc_handler = sched_rt_handler,
	},
	{
		.procname = "sched_rt_runtime_us",
		.data = &sysctl_sched_rt_runtime,
		.maxlen = sizeof(int),
		.mode = 0644,
		.proc_handler = sched_rt_handler,
	},
	{
		.procname = "sched_rr_timeslice_ms",
		.data = &sched_rr_timeslice,
		.maxlen = sizeof(int),
		.mode = 0644,
		.proc_handler = sched_rr_handler,
	},
#ifdef CONFIG_SCHED_AUTOGROUP
	{
		.procname = "sched_autogroup_enabled",
		.data = &sysctl_sched_autogroup_enabled,
		.maxlen = sizeof(unsigned int),
		.mode = 0644,
		.proc_handler = proc_dointvec_minmax,
		.extra1 = &zero,
		.extra2 = &one,
	},
#endif
#ifdef CONFIG_CFS_BANDWIDTH
	{
		.procname = "sched_cfs_bandwidth_slice_us",
		.data = &sysctl_sched_cfs_bandwidth_slice,
		.maxlen = sizeof(unsigned int),
		.mode = 0644,
		.proc_handler = proc_dointvec_minmax,
		.extra1 = &one,
	},
#endif
#ifdef CONFIG_PROVE_LOCKING
	{
		.procname = "prove_locking",
		.data = &prove_locking,
		.maxlen = sizeof(int),
		.mode = 0644,
		.proc_handler = proc_dointvec,
	},
#endif
#ifdef CONFIG_LOCK_STAT
	{
		.procname = "lock_stat",
		.data = &lock_stat,
		.maxlen = sizeof(int),
		.mode = 0644,
		.proc_handler = proc_dointvec,
	},
#endif
	{
		.procname = "panic",
		.data = &panic_timeout,
		.maxlen = sizeof(int),
		.mode = 0644,
		.proc_handler = proc_dointvec,
	},
#ifdef CONFIG_COREDUMP
	{
		.procname = "core_uses_pid",
		.data = &core_uses_pid,
		.maxlen = sizeof(int),
		.mode = 0644,
		.proc_handler = proc_dointvec,
	},
	{
		.procname = "core_pattern",
		.data = core_pattern,
		.maxlen = CORENAME_MAX_SIZE,
		.mode = 0644,
		.proc_handler = proc_dostring_coredump,
	},
	{
		.procname = "core_pipe_limit",
		.data = &core_pipe_limit,
		.maxlen = sizeof(unsigned int),
		.mode = 0644,
		.proc_handler = proc_dointvec,
	},
#endif
#ifdef CONFIG_PROC_SYSCTL
	{
		.procname = "tainted",
		.maxlen = sizeof(long),
		.mode = 0644,
		.proc_handler = proc_taint,
	},
#endif
#ifdef CONFIG_LATENCYTOP
	{
		.procname = "latencytop",
		.data = &latencytop_enabled,
		.maxlen = sizeof(int),
		.mode = 0644,
		.proc_handler = proc_dointvec,
	},
#endif
#ifdef CONFIG_BLK_DEV_INITRD
	{
		.procname = "real-root-dev",
		.data = &real_root_dev,
		.maxlen = sizeof(int),
		.mode = 0644,
		.proc_handler = proc_dointvec,
	},
#endif
	{
		.procname = "print-fatal-signals",
		.data = &print_fatal_signals,
		.maxlen = sizeof(int),
		.mode = 0644,
		.proc_handler = proc_dointvec,
	},
#ifdef CONFIG_SPARC
	{
		.procname = "reboot-cmd",
		.data = reboot_command,
		.maxlen = 256,
		.mode = 0644,
		.proc_handler = proc_dostring,
	},
	{
		.procname = "stop-a",
		.data = &stop_a_enabled,
		.maxlen = sizeof (int),
		.mode = 0644,
		.proc_handler = proc_dointvec,
	},
	{
		.procname = "scons-poweroff",
		.data = &scons_pwroff,
		.maxlen = sizeof (int),
		.mode = 0644,
		.proc_handler = proc_dointvec,
	},
#endif
#ifdef CONFIG_SPARC64
	{
		.procname = "tsb-ratio",
		.data = &sysctl_tsb_ratio,
		.maxlen = sizeof (int),
		.mode = 0644,
		.proc_handler = proc_dointvec,
	},
#endif
#ifdef __hppa__
	{
		.procname = "soft-power",
		.data = &pwrsw_enabled,
		.maxlen = sizeof (int),
		.mode = 0644,
		.proc_handler = proc_dointvec,
	},
#endif
#ifdef CONFIG_SYSCTL_ARCH_UNALIGN_ALLOW
	{
		.procname = "unaligned-trap",
		.data = &unaligned_enabled,
		.maxlen = sizeof (int),
		.mode = 0644,
		.proc_handler = proc_dointvec,
	},
#endif
	{
		.procname = "ctrl-alt-del",
		.data = &C_A_D,
		.maxlen = sizeof(int),
		.mode = 0644,
		.proc_handler = proc_dointvec,
	},
#ifdef CONFIG_FUNCTION_TRACER
	{
		.procname = "ftrace_enabled",
		.data = &ftrace_enabled,
		.maxlen = sizeof(int),
		.mode = 0644,
		.proc_handler = ftrace_enable_sysctl,
	},
#endif
#ifdef CONFIG_STACK_TRACER
	{
		.procname = "stack_tracer_enabled",
		.data = &stack_tracer_enabled,
		.maxlen = sizeof(int),
		.mode = 0644,
		.proc_handler = stack_trace_sysctl,
	},
#endif
#ifdef CONFIG_TRACING
	{
		.procname = "ftrace_dump_on_oops",
		.data = &ftrace_dump_on_oops,
		.maxlen = sizeof(int),
		.mode = 0644,
		.proc_handler = proc_dointvec,
	},
#endif
#ifdef CONFIG_MODULES
	{
		.procname = "modprobe",
		.data = &modprobe_path,
		.maxlen = KMOD_PATH_LEN,
		.mode = 0644,
		.proc_handler = proc_dostring,
	},
	{
		.procname = "modules_disabled",
		.data = &modules_disabled,
		.maxlen = sizeof(int),
		.mode = 0644,
		/* only handle a transition from default "0" to "1" */
		.proc_handler = proc_dointvec_minmax,
		.extra1 = &one,
		.extra2 = &one,
	},
#endif

	{
		.procname = "hotplug",
		.data = &uevent_helper,
		.maxlen = UEVENT_HELPER_PATH_LEN,
		.mode = 0644,
		.proc_handler = proc_dostring,
	},

#ifdef CONFIG_CHR_DEV_SG
	{
		.procname = "sg-big-buff",
		.data = &sg_big_buff,
		.maxlen = sizeof (int),
		.mode = 0444,
		.proc_handler = proc_dointvec,
	},
#endif
#ifdef CONFIG_BSD_PROCESS_ACCT
	{
		.procname = "acct",
		.data = &acct_parm,
		.maxlen = 3*sizeof(int),
		.mode = 0644,
		.proc_handler = proc_dointvec,
	},
#endif
#ifdef CONFIG_MAGIC_SYSRQ
	{
		.procname = "sysrq",
		.data = &__sysrq_enabled,
		.maxlen = sizeof (int),
		.mode = 0644,
		.proc_handler = sysrq_sysctl_handler,
	},
#endif
#ifdef CONFIG_PROC_SYSCTL
	{
		.procname = "cad_pid",
		.data = NULL,
		.maxlen = sizeof (int),
		.mode = 0600,
		.proc_handler = proc_do_cad_pid,
	},
#endif
	{
		.procname = "threads-max",
		.data = &max_threads,
		.maxlen = sizeof(int),
		.mode = 0644,
		.proc_handler = proc_dointvec,
	},
	{
		.procname = "random",
		.mode = 0555,
		.child = random_table,
	},
	{
		.procname = "usermodehelper",
		.mode = 0555,
		.child = usermodehelper_table,
	},
	{
		.procname = "overflowuid",
		.data = &overflowuid,
		.maxlen = sizeof(int),
		.mode = 0644,
		.proc_handler = proc_dointvec_minmax,
		.extra1 = &minolduid,
		.extra2 = &maxolduid,
	},
	{
		.procname = "overflowgid",
		.data = &overflowgid,
		.maxlen = sizeof(int),
		.mode = 0644,
		.proc_handler = proc_dointvec_minmax,
		.extra1 = &minolduid,
		.extra2 = &maxolduid,
	},
#ifdef CONFIG_S390
#ifdef CONFIG_MATHEMU
	{
		.procname = "ieee_emulation_warnings",
		.data = &sysctl_ieee_emulation_warnings,
		.maxlen = sizeof(int),
		.mode = 0644,
		.proc_handler = proc_dointvec,
	},
#endif
	{
		.procname = "userprocess_debug",
		.data = &show_unhandled_signals,
		.maxlen = sizeof(int),
		.mode = 0644,
		.proc_handler = proc_dointvec,
	},
#endif
	{
		.procname = "pid_max",
		.data = &pid_max,
		.maxlen = sizeof (int),
		.mode = 0644,
		.proc_handler = proc_dointvec_minmax,
		.extra1 = &pid_max_min,
		.extra2 = &pid_max_max,
	},
	{
		.procname = "panic_on_oops",
		.data = &panic_on_oops,
		.maxlen = sizeof(int),
		.mode = 0644,
		.proc_handler = proc_dointvec,
	},
#if defined CONFIG_PRINTK
	{
		.procname = "printk",
		.data = &console_loglevel,
		.maxlen = 4*sizeof(int),
		.mode = 0644,
		.proc_handler = proc_dointvec,
	},
	{
		.procname = "printk_ratelimit",
		.data = &printk_ratelimit_state.interval,
		.maxlen = sizeof(int),
		.mode = 0644,
		.proc_handler = proc_dointvec_jiffies,
	},
	{
		.procname = "printk_ratelimit_burst",
		.data = &printk_ratelimit_state.burst,
		.maxlen = sizeof(int),
		.mode = 0644,
		.proc_handler = proc_dointvec,
	},
	{
		.procname = "printk_delay",
		.data = &printk_delay_msec,
		.maxlen = sizeof(int),
		.mode = 0644,
		.proc_handler = proc_dointvec_minmax,
		.extra1 = &zero,
		.extra2 = &ten_thousand,
	},
	{
		.procname = "dmesg_restrict",
		.data = &dmesg_restrict,
		.maxlen = sizeof(int),
		.mode = 0644,
		.proc_handler = proc_dointvec_minmax_sysadmin,
		.extra1 = &zero,
		.extra2 = &one,
	},
	{
		.procname = "kptr_restrict",
		.data = &kptr_restrict,
		.maxlen = sizeof(int),
		.mode = 0644,
		.proc_handler = proc_dointvec_minmax_sysadmin,
		.extra1 = &zero,
		.extra2 = &two,
	},
#endif
	{
		.procname = "ngroups_max",
		.data = &ngroups_max,
		.maxlen = sizeof (int),
		.mode = 0444,
		.proc_handler = proc_dointvec,
	},
	{
		.procname = "cap_last_cap",
		.data = (void *)&cap_last_cap,
		.maxlen = sizeof(int),
		.mode = 0444,
		.proc_handler = proc_dointvec,
	},
#if defined(CONFIG_LOCKUP_DETECTOR)
	{
		.procname = "watchdog",
		.data = &watchdog_enabled,
		.maxlen = sizeof (int),
		.mode = 0644,
		.proc_handler = proc_dowatchdog,
		.extra1 = &zero,
		.extra2 = &one,
	},
	{
		.procname = "watchdog_thresh",
		.data = &watchdog_thresh,
		.maxlen = sizeof(int),
		.mode = 0644,
		.proc_handler = proc_dowatchdog,
		.extra1 = &neg_one,
		.extra2 = &sixty,
	},
	{
		.procname = "softlockup_panic",
		.data = &softlockup_panic,
		.maxlen = sizeof(int),
		.mode = 0644,
		.proc_handler = proc_dointvec_minmax,
		.extra1 = &zero,
		.extra2 = &one,
	},
	{
		.procname = "nmi_watchdog",
		.data = &watchdog_enabled,
		.maxlen = sizeof (int),
		.mode = 0644,
		.proc_handler = proc_dowatchdog,
		.extra1 = &zero,
		.extra2 = &one,
	},
#endif
#if defined(CONFIG_X86_LOCAL_APIC) && defined(CONFIG_X86)
|
|
{
|
|
.procname = "unknown_nmi_panic",
|
|
.data = &unknown_nmi_panic,
|
|
.maxlen = sizeof (int),
|
|
.mode = 0644,
|
|
.proc_handler = proc_dointvec,
|
|
},
|
|
#endif
|
|
#if defined(CONFIG_X86)
|
|
{
|
|
.procname = "panic_on_unrecovered_nmi",
|
|
.data = &panic_on_unrecovered_nmi,
|
|
.maxlen = sizeof(int),
|
|
.mode = 0644,
|
|
.proc_handler = proc_dointvec,
|
|
},
|
|
{
|
|
.procname = "panic_on_io_nmi",
|
|
.data = &panic_on_io_nmi,
|
|
.maxlen = sizeof(int),
|
|
.mode = 0644,
|
|
.proc_handler = proc_dointvec,
|
|
},
|
|
#ifdef CONFIG_DEBUG_STACKOVERFLOW
|
|
{
|
|
.procname = "panic_on_stackoverflow",
|
|
.data = &sysctl_panic_on_stackoverflow,
|
|
.maxlen = sizeof(int),
|
|
.mode = 0644,
|
|
.proc_handler = proc_dointvec,
|
|
},
|
|
#endif
|
|
{
|
|
.procname = "bootloader_type",
|
|
.data = &bootloader_type,
|
|
.maxlen = sizeof (int),
|
|
.mode = 0444,
|
|
.proc_handler = proc_dointvec,
|
|
},
|
|
{
|
|
.procname = "bootloader_version",
|
|
.data = &bootloader_version,
|
|
.maxlen = sizeof (int),
|
|
.mode = 0444,
|
|
.proc_handler = proc_dointvec,
|
|
},
|
|
{
|
|
.procname = "kstack_depth_to_print",
|
|
.data = &kstack_depth_to_print,
|
|
.maxlen = sizeof(int),
|
|
.mode = 0644,
|
|
.proc_handler = proc_dointvec,
|
|
},
|
|
{
|
|
.procname = "io_delay_type",
|
|
.data = &io_delay_type,
|
|
.maxlen = sizeof(int),
|
|
.mode = 0644,
|
|
.proc_handler = proc_dointvec,
|
|
},
|
|
#endif
|
|
#if defined(CONFIG_MMU)
	{
		.procname = "randomize_va_space",
		.data = &randomize_va_space,
		.maxlen = sizeof(int),
		.mode = 0644,
		.proc_handler = proc_dointvec,
	},
#endif
#if defined(CONFIG_S390) && defined(CONFIG_SMP)
	{
		.procname = "spin_retry",
		.data = &spin_retry,
		.maxlen = sizeof (int),
		.mode = 0644,
		.proc_handler = proc_dointvec,
	},
#endif
#if defined(CONFIG_ACPI_SLEEP) && defined(CONFIG_X86)
	{
		.procname = "acpi_video_flags",
		.data = &acpi_realmode_flags,
		.maxlen = sizeof (unsigned long),
		.mode = 0644,
		.proc_handler = proc_doulongvec_minmax,
	},
#endif
#ifdef CONFIG_SYSCTL_ARCH_UNALIGN_NO_WARN
	{
		.procname = "ignore-unaligned-usertrap",
		.data = &no_unaligned_warning,
		.maxlen = sizeof (int),
		.mode = 0644,
		.proc_handler = proc_dointvec,
	},
#endif
#ifdef CONFIG_IA64
	{
		.procname = "unaligned-dump-stack",
		.data = &unaligned_dump_stack,
		.maxlen = sizeof (int),
		.mode = 0644,
		.proc_handler = proc_dointvec,
	},
#endif
#ifdef CONFIG_DETECT_HUNG_TASK
	{
		.procname = "hung_task_panic",
		.data = &sysctl_hung_task_panic,
		.maxlen = sizeof(int),
		.mode = 0644,
		.proc_handler = proc_dointvec_minmax,
		.extra1 = &zero,
		.extra2 = &one,
	},
	{
		.procname = "hung_task_check_count",
		.data = &sysctl_hung_task_check_count,
		.maxlen = sizeof(unsigned long),
		.mode = 0644,
		.proc_handler = proc_doulongvec_minmax,
	},
	{
		.procname = "hung_task_timeout_secs",
		.data = &sysctl_hung_task_timeout_secs,
		.maxlen = sizeof(unsigned long),
		.mode = 0644,
		.proc_handler = proc_dohung_task_timeout_secs,
	},
	{
		.procname = "hung_task_warnings",
		.data = &sysctl_hung_task_warnings,
		.maxlen = sizeof(unsigned long),
		.mode = 0644,
		.proc_handler = proc_doulongvec_minmax,
	},
#endif
#ifdef CONFIG_COMPAT
	{
		.procname = "compat-log",
		.data = &compat_log,
		.maxlen = sizeof (int),
		.mode = 0644,
		.proc_handler = proc_dointvec,
	},
#endif
#ifdef CONFIG_RT_MUTEXES
	{
		.procname = "max_lock_depth",
		.data = &max_lock_depth,
		.maxlen = sizeof(int),
		.mode = 0644,
		.proc_handler = proc_dointvec,
	},
#endif
	{
		.procname = "poweroff_cmd",
		.data = &poweroff_cmd,
		.maxlen = POWEROFF_CMD_PATH_LEN,
		.mode = 0644,
		.proc_handler = proc_dostring,
	},
#ifdef CONFIG_KEYS
	{
		.procname = "keys",
		.mode = 0555,
		.child = key_sysctls,
	},
#endif
#ifdef CONFIG_RCU_TORTURE_TEST
	{
		.procname = "rcutorture_runnable",
		.data = &rcutorture_runnable,
		.maxlen = sizeof(int),
		.mode = 0644,
		.proc_handler = proc_dointvec,
	},
#endif
#ifdef CONFIG_PERF_EVENTS
	/*
	 * User-space scripts rely on the existence of this file
	 * as a feature check for perf_events being enabled.
	 *
	 * So it's an ABI, do not remove!
	 */
	{
		.procname = "perf_event_paranoid",
		.data = &sysctl_perf_event_paranoid,
		.maxlen = sizeof(sysctl_perf_event_paranoid),
		.mode = 0644,
		.proc_handler = proc_dointvec,
	},
	{
		.procname = "perf_event_mlock_kb",
		.data = &sysctl_perf_event_mlock,
		.maxlen = sizeof(sysctl_perf_event_mlock),
		.mode = 0644,
		.proc_handler = proc_dointvec,
	},
	{
		.procname = "perf_event_max_sample_rate",
		.data = &sysctl_perf_event_sample_rate,
		.maxlen = sizeof(sysctl_perf_event_sample_rate),
		.mode = 0644,
		.proc_handler = perf_proc_update_handler,
	},
#endif
#ifdef CONFIG_KMEMCHECK
	{
		.procname = "kmemcheck",
		.data = &kmemcheck_enabled,
		.maxlen = sizeof(int),
		.mode = 0644,
		.proc_handler = proc_dointvec,
	},
#endif
#ifdef CONFIG_BLOCK
	{
		.procname = "blk_iopoll",
		.data = &blk_iopoll_enabled,
		.maxlen = sizeof(int),
		.mode = 0644,
		.proc_handler = proc_dointvec,
	},
#endif
	{ }
};

static struct ctl_table vm_table[] = {
	{
		.procname = "overcommit_memory",
		.data = &sysctl_overcommit_memory,
		.maxlen = sizeof(sysctl_overcommit_memory),
		.mode = 0644,
		.proc_handler = proc_dointvec_minmax,
		.extra1 = &zero,
		.extra2 = &two,
	},
	{
		.procname = "panic_on_oom",
		.data = &sysctl_panic_on_oom,
		.maxlen = sizeof(sysctl_panic_on_oom),
		.mode = 0644,
		.proc_handler = proc_dointvec_minmax,
		.extra1 = &zero,
		.extra2 = &two,
	},
	{
		.procname = "oom_kill_allocating_task",
		.data = &sysctl_oom_kill_allocating_task,
		.maxlen = sizeof(sysctl_oom_kill_allocating_task),
		.mode = 0644,
		.proc_handler = proc_dointvec,
	},
	{
		.procname = "oom_dump_tasks",
		.data = &sysctl_oom_dump_tasks,
		.maxlen = sizeof(sysctl_oom_dump_tasks),
		.mode = 0644,
		.proc_handler = proc_dointvec,
	},
	{
		.procname = "overcommit_ratio",
		.data = &sysctl_overcommit_ratio,
		.maxlen = sizeof(sysctl_overcommit_ratio),
		.mode = 0644,
		.proc_handler = proc_dointvec,
	},
	{
		.procname = "page-cluster",
		.data = &page_cluster,
		.maxlen = sizeof(int),
		.mode = 0644,
		.proc_handler = proc_dointvec_minmax,
		.extra1 = &zero,
	},
	{
		.procname = "dirty_background_ratio",
		.data = &dirty_background_ratio,
		.maxlen = sizeof(dirty_background_ratio),
		.mode = 0644,
		.proc_handler = dirty_background_ratio_handler,
		.extra1 = &zero,
		.extra2 = &one_hundred,
	},
	{
		.procname = "dirty_background_bytes",
		.data = &dirty_background_bytes,
		.maxlen = sizeof(dirty_background_bytes),
		.mode = 0644,
		.proc_handler = dirty_background_bytes_handler,
		.extra1 = &one_ul,
	},
	{
		.procname = "dirty_ratio",
		.data = &vm_dirty_ratio,
		.maxlen = sizeof(vm_dirty_ratio),
		.mode = 0644,
		.proc_handler = dirty_ratio_handler,
		.extra1 = &zero,
		.extra2 = &one_hundred,
	},
	{
		.procname = "dirty_bytes",
		.data = &vm_dirty_bytes,
		.maxlen = sizeof(vm_dirty_bytes),
		.mode = 0644,
		.proc_handler = dirty_bytes_handler,
		.extra1 = &dirty_bytes_min,
	},
	{
		.procname = "dirty_writeback_centisecs",
		.data = &dirty_writeback_interval,
		.maxlen = sizeof(dirty_writeback_interval),
		.mode = 0644,
		.proc_handler = dirty_writeback_centisecs_handler,
	},
	{
		.procname = "dirty_expire_centisecs",
		.data = &dirty_expire_interval,
		.maxlen = sizeof(dirty_expire_interval),
		.mode = 0644,
		.proc_handler = proc_dointvec_minmax,
		.extra1 = &zero,
	},
	{
		.procname = "nr_pdflush_threads",
		.mode = 0444 /* read-only */,
		.proc_handler = pdflush_proc_obsolete,
	},
	{
		.procname = "swappiness",
		.data = &vm_swappiness,
		.maxlen = sizeof(vm_swappiness),
		.mode = 0644,
		.proc_handler = proc_dointvec_minmax,
		.extra1 = &zero,
		.extra2 = &one_hundred,
	},
#ifdef CONFIG_HUGETLB_PAGE
	{
		.procname = "nr_hugepages",
		.data = NULL,
		.maxlen = sizeof(unsigned long),
		.mode = 0644,
		.proc_handler = hugetlb_sysctl_handler,
		.extra1 = (void *)&hugetlb_zero,
		.extra2 = (void *)&hugetlb_infinity,
	},
#ifdef CONFIG_NUMA
	{
		.procname = "nr_hugepages_mempolicy",
		.data = NULL,
		.maxlen = sizeof(unsigned long),
		.mode = 0644,
		.proc_handler = &hugetlb_mempolicy_sysctl_handler,
		.extra1 = (void *)&hugetlb_zero,
		.extra2 = (void *)&hugetlb_infinity,
	},
#endif
	{
		.procname = "hugetlb_shm_group",
		.data = &sysctl_hugetlb_shm_group,
		.maxlen = sizeof(gid_t),
		.mode = 0644,
		.proc_handler = proc_dointvec,
	},
	{
		.procname = "hugepages_treat_as_movable",
		.data = &hugepages_treat_as_movable,
		.maxlen = sizeof(int),
		.mode = 0644,
		.proc_handler = hugetlb_treat_movable_handler,
	},
	{
		.procname = "nr_overcommit_hugepages",
		.data = NULL,
		.maxlen = sizeof(unsigned long),
		.mode = 0644,
		.proc_handler = hugetlb_overcommit_handler,
		.extra1 = (void *)&hugetlb_zero,
		.extra2 = (void *)&hugetlb_infinity,
	},
#endif
	{
		.procname = "lowmem_reserve_ratio",
		.data = &sysctl_lowmem_reserve_ratio,
		.maxlen = sizeof(sysctl_lowmem_reserve_ratio),
		.mode = 0644,
		.proc_handler = lowmem_reserve_ratio_sysctl_handler,
	},
	{
		.procname = "drop_caches",
		.data = &sysctl_drop_caches,
		.maxlen = sizeof(int),
		.mode = 0644,
		.proc_handler = drop_caches_sysctl_handler,
		.extra1 = &one,
		.extra2 = &three,
	},
#ifdef CONFIG_COMPACTION
	{
		.procname = "compact_memory",
		.data = &sysctl_compact_memory,
		.maxlen = sizeof(int),
		.mode = 0200,
		.proc_handler = sysctl_compaction_handler,
	},
	{
		.procname = "extfrag_threshold",
		.data = &sysctl_extfrag_threshold,
		.maxlen = sizeof(int),
		.mode = 0644,
		.proc_handler = sysctl_extfrag_handler,
		.extra1 = &min_extfrag_threshold,
		.extra2 = &max_extfrag_threshold,
	},

#endif /* CONFIG_COMPACTION */
	{
		.procname = "min_free_kbytes",
		.data = &min_free_kbytes,
		.maxlen = sizeof(min_free_kbytes),
		.mode = 0644,
		.proc_handler = min_free_kbytes_sysctl_handler,
		.extra1 = &zero,
	},
	{
		.procname = "percpu_pagelist_fraction",
		.data = &percpu_pagelist_fraction,
		.maxlen = sizeof(percpu_pagelist_fraction),
		.mode = 0644,
		.proc_handler = percpu_pagelist_fraction_sysctl_handler,
		.extra1 = &min_percpu_pagelist_fract,
	},
#ifdef CONFIG_MMU
	{
		.procname = "max_map_count",
		.data = &sysctl_max_map_count,
		.maxlen = sizeof(sysctl_max_map_count),
		.mode = 0644,
		.proc_handler = proc_dointvec_minmax,
		.extra1 = &zero,
	},
#else
	{
		.procname = "nr_trim_pages",
		.data = &sysctl_nr_trim_pages,
		.maxlen = sizeof(sysctl_nr_trim_pages),
		.mode = 0644,
		.proc_handler = proc_dointvec_minmax,
		.extra1 = &zero,
	},
#endif
	{
		.procname = "laptop_mode",
		.data = &laptop_mode,
		.maxlen = sizeof(laptop_mode),
		.mode = 0644,
		.proc_handler = proc_dointvec_jiffies,
	},
	{
		.procname = "block_dump",
		.data = &block_dump,
		.maxlen = sizeof(block_dump),
		.mode = 0644,
		.proc_handler = proc_dointvec,
		.extra1 = &zero,
	},
	{
		.procname = "vfs_cache_pressure",
		.data = &sysctl_vfs_cache_pressure,
		.maxlen = sizeof(sysctl_vfs_cache_pressure),
		.mode = 0644,
		.proc_handler = proc_dointvec,
		.extra1 = &zero,
	},
#ifdef HAVE_ARCH_PICK_MMAP_LAYOUT
	{
		.procname = "legacy_va_layout",
		.data = &sysctl_legacy_va_layout,
		.maxlen = sizeof(sysctl_legacy_va_layout),
		.mode = 0644,
		.proc_handler = proc_dointvec,
		.extra1 = &zero,
	},
#endif
#ifdef CONFIG_NUMA
	{
		.procname = "zone_reclaim_mode",
		.data = &zone_reclaim_mode,
		.maxlen = sizeof(zone_reclaim_mode),
		.mode = 0644,
		.proc_handler = proc_dointvec,
		.extra1 = &zero,
	},
	{
		.procname = "min_unmapped_ratio",
		.data = &sysctl_min_unmapped_ratio,
		.maxlen = sizeof(sysctl_min_unmapped_ratio),
		.mode = 0644,
		.proc_handler = sysctl_min_unmapped_ratio_sysctl_handler,
		.extra1 = &zero,
		.extra2 = &one_hundred,
	},
	{
		.procname = "min_slab_ratio",
		.data = &sysctl_min_slab_ratio,
		.maxlen = sizeof(sysctl_min_slab_ratio),
		.mode = 0644,
		.proc_handler = sysctl_min_slab_ratio_sysctl_handler,
		.extra1 = &zero,
		.extra2 = &one_hundred,
	},
#endif
#ifdef CONFIG_SMP
	{
		.procname = "stat_interval",
		.data = &sysctl_stat_interval,
		.maxlen = sizeof(sysctl_stat_interval),
		.mode = 0644,
		.proc_handler = proc_dointvec_jiffies,
	},
#endif
#ifdef CONFIG_MMU
	{
		.procname = "mmap_min_addr",
		.data = &dac_mmap_min_addr,
		.maxlen = sizeof(unsigned long),
		.mode = 0644,
		.proc_handler = mmap_min_addr_handler,
	},
#endif
#ifdef CONFIG_NUMA
	{
		.procname = "numa_zonelist_order",
		.data = &numa_zonelist_order,
		.maxlen = NUMA_ZONELIST_ORDER_LEN,
		.mode = 0644,
		.proc_handler = numa_zonelist_order_handler,
	},
#endif
#if (defined(CONFIG_X86_32) && !defined(CONFIG_UML))|| \
   (defined(CONFIG_SUPERH) && defined(CONFIG_VSYSCALL))
	{
		.procname = "vdso_enabled",
		.data = &vdso_enabled,
		.maxlen = sizeof(vdso_enabled),
		.mode = 0644,
		.proc_handler = proc_dointvec,
		.extra1 = &zero,
	},
#endif
#ifdef CONFIG_HIGHMEM
	{
		.procname = "highmem_is_dirtyable",
		.data = &vm_highmem_is_dirtyable,
		.maxlen = sizeof(vm_highmem_is_dirtyable),
		.mode = 0644,
		.proc_handler = proc_dointvec_minmax,
		.extra1 = &zero,
		.extra2 = &one,
	},
#endif
	{
		.procname = "scan_unevictable_pages",
		.data = &scan_unevictable_pages,
		.maxlen = sizeof(scan_unevictable_pages),
		.mode = 0644,
		.proc_handler = scan_unevictable_handler,
	},
#ifdef CONFIG_MEMORY_FAILURE
	{
		.procname = "memory_failure_early_kill",
		.data = &sysctl_memory_failure_early_kill,
		.maxlen = sizeof(sysctl_memory_failure_early_kill),
		.mode = 0644,
		.proc_handler = proc_dointvec_minmax,
		.extra1 = &zero,
		.extra2 = &one,
	},
	{
		.procname = "memory_failure_recovery",
		.data = &sysctl_memory_failure_recovery,
		.maxlen = sizeof(sysctl_memory_failure_recovery),
		.mode = 0644,
		.proc_handler = proc_dointvec_minmax,
		.extra1 = &zero,
		.extra2 = &one,
	},
#endif
	{
		.procname = "user_reserve_kbytes",
		.data = &sysctl_user_reserve_kbytes,
		.maxlen = sizeof(sysctl_user_reserve_kbytes),
		.mode = 0644,
		.proc_handler = proc_doulongvec_minmax,
	},
	{ }
};
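
/*
 * Illustrative sketch, not part of this file. The commit description says
 * the user reserve is limited to min(3% of the current process size,
 * user_reserve_kbytes). The block below shows one way that arithmetic could
 * look in user space. The helper name, the page_shift parameter, and the
 * use of "/ 32" as the roughly-3% approximation are assumptions made for
 * the example; the table entry above only exposes the knob.
 */

```c
/* Hypothetical helper: pages held back for other user processes. */
static unsigned long user_reserve_sketch(unsigned long total_vm_pages,
					 unsigned long reserve_kbytes,
					 unsigned int page_shift)
{
	/* kbytes -> pages: one page holds 2^(page_shift - 10) KiB */
	unsigned long reserve_pages = reserve_kbytes >> (page_shift - 10);
	/* roughly 3% of the process size, approximated as 1/32 (assumption) */
	unsigned long by_size = total_vm_pages / 32;

	return by_size < reserve_pages ? by_size : reserve_pages;
}
```

/*
 * With 4 KiB pages (page_shift 12) and a 128 MB knob value (131072 kbytes),
 * a small process is charged 1/32 of its size, while a very large one is
 * capped at the knob value.
 */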

#if defined(CONFIG_BINFMT_MISC) || defined(CONFIG_BINFMT_MISC_MODULE)
static struct ctl_table binfmt_misc_table[] = {
	{ }
};
#endif

static struct ctl_table fs_table[] = {
	{
		.procname = "inode-nr",
		.data = &inodes_stat,
		.maxlen = 2*sizeof(int),
		.mode = 0444,
		.proc_handler = proc_nr_inodes,
	},
	{
		.procname = "inode-state",
		.data = &inodes_stat,
		.maxlen = 7*sizeof(int),
		.mode = 0444,
		.proc_handler = proc_nr_inodes,
	},
	{
		.procname = "file-nr",
		.data = &files_stat,
		.maxlen = sizeof(files_stat),
		.mode = 0444,
		.proc_handler = proc_nr_files,
	},
	{
		.procname = "file-max",
		.data = &files_stat.max_files,
		.maxlen = sizeof(files_stat.max_files),
		.mode = 0644,
		.proc_handler = proc_doulongvec_minmax,
	},
	{
		.procname = "nr_open",
		.data = &sysctl_nr_open,
		.maxlen = sizeof(int),
		.mode = 0644,
		.proc_handler = proc_dointvec_minmax,
		.extra1 = &sysctl_nr_open_min,
		.extra2 = &sysctl_nr_open_max,
	},
	{
		.procname = "dentry-state",
		.data = &dentry_stat,
		.maxlen = 6*sizeof(int),
		.mode = 0444,
		.proc_handler = proc_nr_dentry,
	},
	{
		.procname = "overflowuid",
		.data = &fs_overflowuid,
		.maxlen = sizeof(int),
		.mode = 0644,
		.proc_handler = proc_dointvec_minmax,
		.extra1 = &minolduid,
		.extra2 = &maxolduid,
	},
	{
		.procname = "overflowgid",
		.data = &fs_overflowgid,
		.maxlen = sizeof(int),
		.mode = 0644,
		.proc_handler = proc_dointvec_minmax,
		.extra1 = &minolduid,
		.extra2 = &maxolduid,
	},
#ifdef CONFIG_FILE_LOCKING
	{
		.procname = "leases-enable",
		.data = &leases_enable,
		.maxlen = sizeof(int),
		.mode = 0644,
		.proc_handler = proc_dointvec,
	},
#endif
#ifdef CONFIG_DNOTIFY
	{
		.procname = "dir-notify-enable",
		.data = &dir_notify_enable,
		.maxlen = sizeof(int),
		.mode = 0644,
		.proc_handler = proc_dointvec,
	},
#endif
#ifdef CONFIG_MMU
#ifdef CONFIG_FILE_LOCKING
	{
		.procname = "lease-break-time",
		.data = &lease_break_time,
		.maxlen = sizeof(int),
		.mode = 0644,
		.proc_handler = proc_dointvec,
	},
#endif
#ifdef CONFIG_AIO
	{
		.procname = "aio-nr",
		.data = &aio_nr,
		.maxlen = sizeof(aio_nr),
		.mode = 0444,
		.proc_handler = proc_doulongvec_minmax,
	},
	{
		.procname = "aio-max-nr",
		.data = &aio_max_nr,
		.maxlen = sizeof(aio_max_nr),
		.mode = 0644,
		.proc_handler = proc_doulongvec_minmax,
	},
#endif /* CONFIG_AIO */
#ifdef CONFIG_INOTIFY_USER
	{
		.procname = "inotify",
		.mode = 0555,
		.child = inotify_table,
	},
#endif
#ifdef CONFIG_EPOLL
	{
		.procname = "epoll",
		.mode = 0555,
		.child = epoll_table,
	},
#endif
#endif
	{
		.procname = "protected_symlinks",
		.data = &sysctl_protected_symlinks,
		.maxlen = sizeof(int),
		.mode = 0600,
		.proc_handler = proc_dointvec_minmax,
		.extra1 = &zero,
		.extra2 = &one,
	},
	{
		.procname = "protected_hardlinks",
		.data = &sysctl_protected_hardlinks,
		.maxlen = sizeof(int),
		.mode = 0600,
		.proc_handler = proc_dointvec_minmax,
		.extra1 = &zero,
		.extra2 = &one,
	},
	{
		.procname = "suid_dumpable",
		.data = &suid_dumpable,
		.maxlen = sizeof(int),
		.mode = 0644,
		.proc_handler = proc_dointvec_minmax_coredump,
		.extra1 = &zero,
		.extra2 = &two,
	},
#if defined(CONFIG_BINFMT_MISC) || defined(CONFIG_BINFMT_MISC_MODULE)
	{
		.procname = "binfmt_misc",
		.mode = 0555,
		.child = binfmt_misc_table,
	},
#endif
	{
		.procname = "pipe-max-size",
		.data = &pipe_max_size,
		.maxlen = sizeof(int),
		.mode = 0644,
		.proc_handler = &pipe_proc_fn,
		.extra1 = &pipe_min_size,
	},
	{ }
};

static struct ctl_table debug_table[] = {
#ifdef CONFIG_SYSCTL_EXCEPTION_TRACE
	{
		.procname = "exception-trace",
		.data = &show_unhandled_signals,
		.maxlen = sizeof(int),
		.mode = 0644,
		.proc_handler = proc_dointvec
	},
#endif
#if defined(CONFIG_OPTPROBES)
	{
		.procname = "kprobes-optimization",
		.data = &sysctl_kprobes_optimization,
		.maxlen = sizeof(int),
		.mode = 0644,
		.proc_handler = proc_kprobes_optimization_handler,
		.extra1 = &zero,
		.extra2 = &one,
	},
#endif
	{ }
};

static struct ctl_table dev_table[] = {
	{ }
};

int __init sysctl_init(void)
{
	struct ctl_table_header *hdr;

	hdr = register_sysctl_table(sysctl_base_table);
	kmemleak_not_leak(hdr);
	return 0;
}

#endif /* CONFIG_SYSCTL */

/*
 * /proc/sys support
 */

#ifdef CONFIG_PROC_SYSCTL

static int _proc_do_string(void* data, int maxlen, int write,
			   void __user *buffer,
			   size_t *lenp, loff_t *ppos)
{
	size_t len;
	char __user *p;
	char c;

	if (!data || !maxlen || !*lenp) {
		*lenp = 0;
		return 0;
	}

	if (write) {
		len = 0;
		p = buffer;
		while (len < *lenp) {
			if (get_user(c, p++))
				return -EFAULT;
			if (c == 0 || c == '\n')
				break;
			len++;
		}
		if (len >= maxlen)
			len = maxlen-1;
		if (copy_from_user(data, buffer, len))
			return -EFAULT;
		((char *) data)[len] = 0;
		*ppos += *lenp;
	} else {
		len = strlen(data);
		if (len > maxlen)
			len = maxlen;

		if (*ppos > len) {
			*lenp = 0;
			return 0;
		}

		data += *ppos;
		len -= *ppos;

		if (len > *lenp)
			len = *lenp;
		if (len)
			if (copy_to_user(buffer, data, len))
				return -EFAULT;
		if (len < *lenp) {
			if (put_user('\n', ((char __user *) buffer) + len))
				return -EFAULT;
			len++;
		}
		*lenp = len;
		*ppos += len;
	}
	return 0;
}

/**
 * proc_dostring - read a string sysctl
 * @table: the sysctl table
 * @write: %TRUE if this is a write to the sysctl file
 * @buffer: the user buffer
 * @lenp: the size of the user buffer
 * @ppos: file position
 *
 * Reads/writes a string from/to the user buffer. If the kernel
 * buffer provided is not large enough to hold the string, the
 * string is truncated. The copied string is %NULL-terminated.
 * If the string is being read by the user process, it is copied
 * and a newline '\n' is added. It is truncated if the buffer is
 * not large enough.
 *
 * Returns 0 on success.
 */
int proc_dostring(struct ctl_table *table, int write,
		  void __user *buffer, size_t *lenp, loff_t *ppos)
{
	return _proc_do_string(table->data, table->maxlen, write,
			       buffer, lenp, ppos);
}
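
/*
 * Illustrative sketch, not part of this file. The write path described
 * above (stop at the first NUL or newline, truncate to the kernel buffer,
 * always terminate) can be tried in user space roughly as follows. The
 * function name is hypothetical and plain memcpy() stands in for
 * copy_from_user(); the caller is assumed to pass maxlen > 0, as the
 * early-return check in _proc_do_string() guarantees.
 */

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical user-space analogue of the string write path. */
static size_t store_string_sketch(char *dst, size_t maxlen,
				  const char *src, size_t srclen)
{
	size_t len = 0;

	/* the value ends at the first NUL or newline in the input */
	while (len < srclen && src[len] != '\0' && src[len] != '\n')
		len++;
	/* truncate so the terminator always fits */
	if (len >= maxlen)
		len = maxlen - 1;
	memcpy(dst, src, len);
	dst[len] = '\0';
	return len;
}
```

/*
 * Writing "hello\nworld" into an 8-byte buffer stores "hello"; writing a
 * 12-character word stores only its first 7 characters.
 */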

static size_t proc_skip_spaces(char **buf)
{
	size_t ret;
	char *tmp = skip_spaces(*buf);
	ret = tmp - *buf;
	*buf = tmp;
	return ret;
}

static void proc_skip_char(char **buf, size_t *size, const char v)
{
	while (*size) {
		if (**buf != v)
			break;
		(*size)--;
		(*buf)++;
	}
}

#define TMPBUFLEN 22
/**
 * proc_get_long - reads an ASCII formatted integer from a user buffer
 *
 * @buf: a kernel buffer
 * @size: size of the kernel buffer
 * @val: this is where the number will be stored
 * @neg: set to %TRUE if number is negative
 * @perm_tr: a vector which contains the allowed trailers
 * @perm_tr_len: size of the perm_tr vector
 * @tr: pointer to store the trailer character
 *
 * In case of success %0 is returned and @buf and @size are updated with
 * the amount of bytes read. If @tr is non-NULL and a trailing
 * character exists (size is non-zero after returning from this
 * function), @tr is updated with the trailing character.
 */
static int proc_get_long(char **buf, size_t *size,
			  unsigned long *val, bool *neg,
			  const char *perm_tr, unsigned perm_tr_len, char *tr)
{
	int len;
	char *p, tmp[TMPBUFLEN];

	if (!*size)
		return -EINVAL;

	len = *size;
	if (len > TMPBUFLEN - 1)
		len = TMPBUFLEN - 1;

	memcpy(tmp, *buf, len);

	tmp[len] = 0;
	p = tmp;
	if (*p == '-' && *size > 1) {
		*neg = true;
		p++;
	} else
		*neg = false;
	if (!isdigit(*p))
		return -EINVAL;

	*val = simple_strtoul(p, &p, 0);

	len = p - tmp;

	/* We don't know if the next char is whitespace thus we may accept
	 * invalid integers (e.g. 1234...a) or two integers instead of one
	 * (e.g. 123...1). So lets not allow such large numbers. */
	if (len == TMPBUFLEN - 1)
		return -EINVAL;

	if (len < *size && perm_tr_len && !memchr(perm_tr, *p, perm_tr_len))
		return -EINVAL;

	if (tr && (len < *size))
		*tr = *p;

	*buf += len;
	*size -= len;

	return 0;
}
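
/*
 * Illustrative sketch, not part of this file. The parsing steps of
 * proc_get_long() (bounded copy, optional '-', digit check, conversion,
 * trailer check, cursor advance) can be exercised in user space along
 * these lines. The function and macro names are hypothetical, the @tr
 * output is omitted, and the C library's strtoul() stands in for the
 * kernel's simple_strtoul().
 */

```c
#include <ctype.h>
#include <errno.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdlib.h>
#include <string.h>

#define SKETCH_BUFLEN 22

static int get_long_sketch(const char **buf, size_t *size,
			   unsigned long *val, bool *neg,
			   const char *perm_tr, size_t perm_tr_len)
{
	char tmp[SKETCH_BUFLEN], *p;
	size_t len;

	if (!*size)
		return -EINVAL;

	/* work on a bounded, NUL-terminated copy of the input */
	len = *size;
	if (len > SKETCH_BUFLEN - 1)
		len = SKETCH_BUFLEN - 1;
	memcpy(tmp, *buf, len);
	tmp[len] = '\0';

	p = tmp;
	if (*p == '-' && *size > 1) {
		*neg = true;
		p++;
	} else {
		*neg = false;
	}
	if (!isdigit((unsigned char)*p))
		return -EINVAL;

	*val = strtoul(p, &p, 0);
	len = (size_t)(p - tmp);

	/* the number filled the whole copy: it may have been cut short */
	if (len == SKETCH_BUFLEN - 1)
		return -EINVAL;
	/* whatever follows the number must be a permitted trailer */
	if (len < *size && perm_tr_len && !memchr(perm_tr, *p, perm_tr_len))
		return -EINVAL;

	*buf += len;
	*size -= len;
	return 0;
}
```

/*
 * Parsing "-42 7" with whitespace trailers yields magnitude 42 with the
 * negative flag set and leaves the cursor on the space; "12x" is rejected
 * because 'x' is not a permitted trailer.
 */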

/**
 * proc_put_long - converts an integer to a decimal ASCII formatted string
 *
 * @buf: the user buffer
 * @size: the size of the user buffer
 * @val: the integer to be converted
 * @neg: sign of the number, %TRUE for negative
 *
 * In case of success %0 is returned and @buf and @size are updated with
 * the amount of bytes written.
 */
static int proc_put_long(void __user **buf, size_t *size, unsigned long val,
			  bool neg)
{
	int len;
	char tmp[TMPBUFLEN], *p = tmp;

	sprintf(p, "%s%lu", neg ? "-" : "", val);
	len = strlen(tmp);
	if (len > *size)
		len = *size;
	if (copy_to_user(*buf, tmp, len))
		return -EFAULT;
	*size -= len;
	*buf += len;
	return 0;
}
#undef TMPBUFLEN

static int proc_put_char(void __user **buf, size_t *size, char c)
{
	if (*size) {
		char __user **buffer = (char __user **)buf;
		if (put_user(c, *buffer))
			return -EFAULT;
		(*size)--, (*buffer)++;
		*buf = *buffer;
	}
	return 0;
}

static int do_proc_dointvec_conv(bool *negp, unsigned long *lvalp,
				 int *valp,
				 int write, void *data)
{
	if (write) {
		*valp = *negp ? -*lvalp : *lvalp;
	} else {
		int val = *valp;
		if (val < 0) {
			*negp = true;
			*lvalp = (unsigned long)-val;
		} else {
			*negp = false;
			*lvalp = (unsigned long)val;
		}
	}
	return 0;
}

static const char proc_wspace_sep[] = { ' ', '\t', '\n' };

static int __do_proc_dointvec(void *tbl_data, struct ctl_table *table,
		  int write, void __user *buffer,
		  size_t *lenp, loff_t *ppos,
		  int (*conv)(bool *negp, unsigned long *lvalp, int *valp,
			      int write, void *data),
		  void *data)
{
	int *i, vleft, first = 1, err = 0;
	unsigned long page = 0;
	size_t left;
	char *kbuf;

	if (!tbl_data || !table->maxlen || !*lenp || (*ppos && !write)) {
		*lenp = 0;
		return 0;
	}

	i = (int *) tbl_data;
	vleft = table->maxlen / sizeof(*i);
	left = *lenp;

	if (!conv)
		conv = do_proc_dointvec_conv;

	if (write) {
		if (left > PAGE_SIZE - 1)
			left = PAGE_SIZE - 1;
		page = __get_free_page(GFP_TEMPORARY);
		kbuf = (char *) page;
		if (!kbuf)
			return -ENOMEM;
		if (copy_from_user(kbuf, buffer, left)) {
			err = -EFAULT;
			goto free;
		}
		kbuf[left] = 0;
	}

	for (; left && vleft--; i++, first=0) {
		unsigned long lval;
		bool neg;

		if (write) {
			left -= proc_skip_spaces(&kbuf);

			if (!left)
				break;
			err = proc_get_long(&kbuf, &left, &lval, &neg,
					     proc_wspace_sep,
					     sizeof(proc_wspace_sep), NULL);
			if (err)
				break;
			if (conv(&neg, &lval, i, 1, data)) {
				err = -EINVAL;
				break;
			}
		} else {
			if (conv(&neg, &lval, i, 0, data)) {
				err = -EINVAL;
				break;
			}
			if (!first)
				err = proc_put_char(&buffer, &left, '\t');
			if (err)
				break;
			err = proc_put_long(&buffer, &left, lval, neg);
			if (err)
				break;
		}
	}

	if (!write && !first && left && !err)
		err = proc_put_char(&buffer, &left, '\n');
	if (write && !err && left)
		left -= proc_skip_spaces(&kbuf);
free:
	if (write) {
		free_page(page);
		if (first)
			return err ? : -EINVAL;
	}
	*lenp -= left;
	*ppos += *lenp;
	return err;
}

static int do_proc_dointvec(struct ctl_table *table, int write,
		  void __user *buffer, size_t *lenp, loff_t *ppos,
		  int (*conv)(bool *negp, unsigned long *lvalp, int *valp,
			      int write, void *data),
		  void *data)
{
	return __do_proc_dointvec(table->data, table, write,
			buffer, lenp, ppos, conv, data);
}

/**
 * proc_dointvec - read a vector of integers
 * @table: the sysctl table
 * @write: %TRUE if this is a write to the sysctl file
 * @buffer: the user buffer
 * @lenp: the size of the user buffer
 * @ppos: file position
 *
 * Reads/writes up to table->maxlen/sizeof(unsigned int) integer
 * values from/to the user buffer, treated as an ASCII string.
 *
 * Returns 0 on success.
 */
int proc_dointvec(struct ctl_table *table, int write,
		     void __user *buffer, size_t *lenp, loff_t *ppos)
{
	return do_proc_dointvec(table,write,buffer,lenp,ppos,
				NULL,NULL);
}
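
/*
 * Illustrative sketch, not part of this file. On the read side the integer
 * vector is emitted as tab-separated decimal values followed by a newline.
 * The helper below reproduces that output format in user space; its name
 * is hypothetical, and unlike the kernel code it drops a value that does
 * not fit completely instead of emitting a truncated number.
 */

```c
#include <stddef.h>
#include <stdio.h>

/* Hypothetical helper: formats vals[] as "a\tb\tc\n" into out. */
static size_t format_intvec_sketch(char *out, size_t outlen,
				   const int *vals, size_t n)
{
	size_t used = 0, i;

	if (!outlen)
		return 0;
	out[0] = '\0';

	for (i = 0; i < n; i++) {
		/* every value after the first is preceded by a tab */
		int w = snprintf(out + used, outlen - used, "%s%d",
				 i ? "\t" : "", vals[i]);
		if (w < 0 || (size_t)w >= outlen - used) {
			out[used] = '\0';	/* drop the partial value */
			break;
		}
		used += (size_t)w;
	}
	/* a trailing newline is added only if something was written */
	if (i && used + 1 < outlen) {
		out[used++] = '\n';
		out[used] = '\0';
	}
	return used;
}
```

/*
 * For the vector {1, -2, 3} and a sufficiently large buffer the result is
 * the 7-byte string "1\t-2\t3\n".
 */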

/*
 * Taint values can only be increased
 * This means we can safely use a temporary.
 */
static int proc_taint(struct ctl_table *table, int write,
			       void __user *buffer, size_t *lenp, loff_t *ppos)
{
	struct ctl_table t;
	unsigned long tmptaint = get_taint();
	int err;

	if (write && !capable(CAP_SYS_ADMIN))
		return -EPERM;

	t = *table;
	t.data = &tmptaint;
	err = proc_doulongvec_minmax(&t, write, buffer, lenp, ppos);
	if (err < 0)
		return err;

	if (write) {
		/*
		 * Poor man's atomic or. Not worth adding a primitive
		 * to everyone's atomic.h for this
		 */
		int i;
		for (i = 0; i < BITS_PER_LONG && tmptaint >> i; i++) {
			if ((tmptaint >> i) & 1)
				add_taint(i, LOCKDEP_STILL_OK);
		}
	}

	return err;
}
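
/*
 * Illustrative sketch, not part of this file. The "poor man's atomic or"
 * loop above walks the written value bit by bit and sets each requested
 * flag individually, so existing flags are never cleared. The user-space
 * helper below mimics that walk on a plain mask; its name is hypothetical
 * and the "|=" stands in for the per-bit add_taint() call.
 */

```c
#include <limits.h>

/* Hypothetical helper: returns current_mask with every bit of requested set. */
static unsigned long or_bits_sketch(unsigned long current_mask,
				    unsigned long requested)
{
	unsigned int i;

	/* stop early once no higher bits remain in the request */
	for (i = 0; i < sizeof(unsigned long) * CHAR_BIT && (requested >> i); i++) {
		if ((requested >> i) & 1)
			current_mask |= 1UL << i;
	}
	return current_mask;
}
```

/*
 * Requesting 0x5 on top of an existing 0x2 gives 0x7; requesting 0 leaves
 * the mask unchanged.
 */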

#ifdef CONFIG_PRINTK
static int proc_dointvec_minmax_sysadmin(struct ctl_table *table, int write,
				void __user *buffer, size_t *lenp, loff_t *ppos)
{
	if (write && !capable(CAP_SYS_ADMIN))
		return -EPERM;

	return proc_dointvec_minmax(table, write, buffer, lenp, ppos);
}
#endif

struct do_proc_dointvec_minmax_conv_param {
	int *min;
	int *max;
};

static int do_proc_dointvec_minmax_conv(bool *negp, unsigned long *lvalp,
					int *valp,
					int write, void *data)
{
	struct do_proc_dointvec_minmax_conv_param *param = data;
	if (write) {
		int val = *negp ? -*lvalp : *lvalp;
		if ((param->min && *param->min > val) ||
		    (param->max && *param->max < val))
			return -EINVAL;
		*valp = val;
	} else {
		int val = *valp;
		if (val < 0) {
			*negp = true;
			*lvalp = (unsigned long)-val;
		} else {
			*negp = false;
			*lvalp = (unsigned long)val;
		}
	}
	return 0;
}

/**
 * proc_dointvec_minmax - read a vector of integers with min/max values
 * @table: the sysctl table
 * @write: %TRUE if this is a write to the sysctl file
 * @buffer: the user buffer
 * @lenp: the size of the user buffer
 * @ppos: file position
 *
 * Reads/writes up to table->maxlen/sizeof(unsigned int) integer
 * values from/to the user buffer, treated as an ASCII string.
 *
 * This routine will ensure the values are within the range specified by
 * table->extra1 (min) and table->extra2 (max).
 *
 * Returns 0 on success.
 */
int proc_dointvec_minmax(struct ctl_table *table, int write,
		  void __user *buffer, size_t *lenp, loff_t *ppos)
{
	struct do_proc_dointvec_minmax_conv_param param = {
		.min = (int *) table->extra1,
		.max = (int *) table->extra2,
	};
	return do_proc_dointvec(table, write, buffer, lenp, ppos,
				do_proc_dointvec_minmax_conv, &param);
}
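
/*
 * Illustrative sketch, not part of this file. The range check performed on
 * writes (table->extra1 as an optional minimum, table->extra2 as an
 * optional maximum) can be tried in user space as shown below. The
 * function name is hypothetical; as in do_proc_dointvec_minmax_conv(), a
 * NULL bound means "no limit" and a rejected value leaves the destination
 * untouched.
 */

```c
#include <errno.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical helper: store a sign/magnitude pair if it lies in range. */
static int store_int_minmax_sketch(int *dst, unsigned long magnitude, bool neg,
				   const int *min, const int *max)
{
	int val = neg ? -(int)magnitude : (int)magnitude;

	if ((min && *min > val) || (max && *max < val))
		return -EINVAL;	/* out of range: *dst is not modified */
	*dst = val;
	return 0;
}
```

/*
 * With bounds [0, 10], writing 5 succeeds, writing 11 or -1 fails with
 * -EINVAL, and dropping the upper bound lets 11 through.
 */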

static void validate_coredump_safety(void)
{
#ifdef CONFIG_COREDUMP
	if (suid_dumpable == SUID_DUMP_ROOT &&
	    core_pattern[0] != '/' && core_pattern[0] != '|') {
		printk(KERN_WARNING "Unsafe core_pattern used with "\
			"suid_dumpable=2. Pipe handler or fully qualified "\
			"core dump path required.\n");
	}
#endif
}
|
|
|
|
static int proc_dointvec_minmax_coredump(struct ctl_table *table, int write,
|
|
void __user *buffer, size_t *lenp, loff_t *ppos)
|
|
{
|
|
int error = proc_dointvec_minmax(table, write, buffer, lenp, ppos);
|
|
if (!error)
|
|
validate_coredump_safety();
|
|
return error;
|
|
}
|
|
|
|
#ifdef CONFIG_COREDUMP
|
|
static int proc_dostring_coredump(struct ctl_table *table, int write,
|
|
void __user *buffer, size_t *lenp, loff_t *ppos)
|
|
{
|
|
int error = proc_dostring(table, write, buffer, lenp, ppos);
|
|
if (!error)
|
|
validate_coredump_safety();
|
|
return error;
|
|
}
|
|
#endif
|
|
|
|
static int __do_proc_doulongvec_minmax(void *data, struct ctl_table *table, int write,
				     void __user *buffer,
				     size_t *lenp, loff_t *ppos,
				     unsigned long convmul,
				     unsigned long convdiv)
{
	unsigned long *i, *min, *max;
	int vleft, first = 1, err = 0;
	unsigned long page = 0;
	size_t left;
	char *kbuf;

	if (!data || !table->maxlen || !*lenp || (*ppos && !write)) {
		*lenp = 0;
		return 0;
	}

	i = (unsigned long *) data;
	min = (unsigned long *) table->extra1;
	max = (unsigned long *) table->extra2;
	vleft = table->maxlen / sizeof(unsigned long);
	left = *lenp;

	if (write) {
		if (left > PAGE_SIZE - 1)
			left = PAGE_SIZE - 1;
		page = __get_free_page(GFP_TEMPORARY);
		kbuf = (char *) page;
		if (!kbuf)
			return -ENOMEM;
		if (copy_from_user(kbuf, buffer, left)) {
			err = -EFAULT;
			goto free;
		}
		kbuf[left] = 0;
	}

	for (; left && vleft--; i++, first = 0) {
		unsigned long val;

		if (write) {
			bool neg;

			left -= proc_skip_spaces(&kbuf);

			err = proc_get_long(&kbuf, &left, &val, &neg,
					     proc_wspace_sep,
					     sizeof(proc_wspace_sep), NULL);
			if (err)
				break;
			if (neg)
				continue;
			/*
			 * Convert to the stored unit (e.g. ms to jiffies)
			 * before range checking, so writes mirror the
			 * conversion done on reads below.
			 */
			val = convmul * val / convdiv;
			if ((min && val < *min) || (max && val > *max))
				continue;
			*i = val;
		} else {
			val = convdiv * (*i) / convmul;
			if (!first) {
				/* Do not let the next call overwrite this error. */
				err = proc_put_char(&buffer, &left, '\t');
				if (err)
					break;
			}
			err = proc_put_long(&buffer, &left, val, false);
			if (err)
				break;
		}
	}

	if (!write && !first && left && !err)
		err = proc_put_char(&buffer, &left, '\n');
	if (write && !err)
		left -= proc_skip_spaces(&kbuf);
free:
	if (write) {
		free_page(page);
		if (first)
			return err ? : -EINVAL;
	}
	*lenp -= left;
	*ppos += *lenp;
	return err;
}

static int do_proc_doulongvec_minmax(struct ctl_table *table, int write,
				     void __user *buffer,
				     size_t *lenp, loff_t *ppos,
				     unsigned long convmul,
				     unsigned long convdiv)
{
	return __do_proc_doulongvec_minmax(table->data, table, write,
			buffer, lenp, ppos, convmul, convdiv);
}

/**
 * proc_doulongvec_minmax - read a vector of long integers with min/max values
 * @table: the sysctl table
 * @write: %TRUE if this is a write to the sysctl file
 * @buffer: the user buffer
 * @lenp: the size of the user buffer
 * @ppos: file position
 *
 * Reads/writes up to table->maxlen/sizeof(unsigned long) unsigned long
 * values from/to the user buffer, treated as an ASCII string.
 *
 * This routine will ensure the values are within the range specified by
 * table->extra1 (min) and table->extra2 (max).
 *
 * Returns 0 on success.
 */
int proc_doulongvec_minmax(struct ctl_table *table, int write,
			   void __user *buffer, size_t *lenp, loff_t *ppos)
{
	return do_proc_doulongvec_minmax(table, write, buffer, lenp, ppos, 1l, 1l);
}

/**
 * proc_doulongvec_ms_jiffies_minmax - read a vector of millisecond values with min/max values
 * @table: the sysctl table
 * @write: %TRUE if this is a write to the sysctl file
 * @buffer: the user buffer
 * @lenp: the size of the user buffer
 * @ppos: file position
 *
 * Reads/writes up to table->maxlen/sizeof(unsigned long) unsigned long
 * values from/to the user buffer, treated as an ASCII string. The values
 * are treated as milliseconds, and converted to jiffies when they are stored.
 *
 * This routine will ensure the values are within the range specified by
 * table->extra1 (min) and table->extra2 (max).
 *
 * Returns 0 on success.
 */
int proc_doulongvec_ms_jiffies_minmax(struct ctl_table *table, int write,
				      void __user *buffer,
				      size_t *lenp, loff_t *ppos)
{
	return do_proc_doulongvec_minmax(table, write, buffer,
					 lenp, ppos, HZ, 1000l);
}
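
/*
 * Illustration only, not part of the kernel build: a userspace sketch of
 * the convmul/convdiv arithmetic behind the millisecond/jiffies conversion
 * described above, with convmul = HZ and convdiv = 1000. The helper names
 * and the runtime hz parameter are assumptions made for the example; in
 * the kernel HZ is a compile-time constant.
 */
```c
#include <assert.h>

/* Milliseconds to ticks: convmul * val / convdiv with convmul = hz. */
static unsigned long ms_to_ticks(unsigned long ms, unsigned long hz)
{
	assert(hz > 0);
	return hz * ms / 1000UL;
}

/* Ticks to milliseconds: convdiv * val / convmul with convmul = hz. */
static unsigned long ticks_to_ms(unsigned long ticks, unsigned long hz)
{
	assert(hz > 0);
	return 1000UL * ticks / hz;
}
```
/*
 * Both directions use truncating integer division, so values shorter than
 * one tick round down to zero and a round trip is not always exact.
 */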

static int do_proc_dointvec_jiffies_conv(bool *negp, unsigned long *lvalp,
					 int *valp,
					 int write, void *data)
{
	if (write) {
		if (*lvalp > LONG_MAX / HZ)
			return 1;
		*valp = *negp ? -(*lvalp*HZ) : (*lvalp*HZ);
	} else {
		int val = *valp;
		unsigned long lval;
		if (val < 0) {
			*negp = true;
			lval = (unsigned long)-val;
		} else {
			*negp = false;
			lval = (unsigned long)val;
		}
		*lvalp = lval / HZ;
	}
	return 0;
}

static int do_proc_dointvec_userhz_jiffies_conv(bool *negp, unsigned long *lvalp,
						int *valp,
						int write, void *data)
{
	if (write) {
		if (USER_HZ < HZ && *lvalp > (LONG_MAX / HZ) * USER_HZ)
			return 1;
		*valp = clock_t_to_jiffies(*negp ? -*lvalp : *lvalp);
	} else {
		int val = *valp;
		unsigned long lval;
		if (val < 0) {
			*negp = true;
			lval = (unsigned long)-val;
		} else {
			*negp = false;
			lval = (unsigned long)val;
		}
		*lvalp = jiffies_to_clock_t(lval);
	}
	return 0;
}

static int do_proc_dointvec_ms_jiffies_conv(bool *negp, unsigned long *lvalp,
					    int *valp,
					    int write, void *data)
{
	if (write) {
		*valp = msecs_to_jiffies(*negp ? -*lvalp : *lvalp);
	} else {
		int val = *valp;
		unsigned long lval;
		if (val < 0) {
			*negp = true;
			lval = (unsigned long)-val;
		} else {
			*negp = false;
			lval = (unsigned long)val;
		}
		*lvalp = jiffies_to_msecs(lval);
	}
	return 0;
}

/**
 * proc_dointvec_jiffies - read a vector of integers as seconds
 * @table: the sysctl table
 * @write: %TRUE if this is a write to the sysctl file
 * @buffer: the user buffer
 * @lenp: the size of the user buffer
 * @ppos: file position
 *
 * Reads/writes up to table->maxlen/sizeof(unsigned int) integer
 * values from/to the user buffer, treated as an ASCII string.
 * The values read are assumed to be in seconds, and are converted into
 * jiffies.
 *
 * Returns 0 on success.
 */
int proc_dointvec_jiffies(struct ctl_table *table, int write,
			  void __user *buffer, size_t *lenp, loff_t *ppos)
{
	return do_proc_dointvec(table, write, buffer, lenp, ppos,
				do_proc_dointvec_jiffies_conv, NULL);
}

/**
 * proc_dointvec_userhz_jiffies - read a vector of integers as 1/USER_HZ seconds
 * @table: the sysctl table
 * @write: %TRUE if this is a write to the sysctl file
 * @buffer: the user buffer
 * @lenp: the size of the user buffer
 * @ppos: pointer to the file position
 *
 * Reads/writes up to table->maxlen/sizeof(unsigned int) integer
 * values from/to the user buffer, treated as an ASCII string.
 * The values read are assumed to be in 1/USER_HZ seconds, and
 * are converted into jiffies.
 *
 * Returns 0 on success.
 */
int proc_dointvec_userhz_jiffies(struct ctl_table *table, int write,
				 void __user *buffer, size_t *lenp, loff_t *ppos)
{
	return do_proc_dointvec(table, write, buffer, lenp, ppos,
				do_proc_dointvec_userhz_jiffies_conv, NULL);
}

/**
 * proc_dointvec_ms_jiffies - read a vector of integers as milliseconds
 * @table: the sysctl table
 * @write: %TRUE if this is a write to the sysctl file
 * @buffer: the user buffer
 * @lenp: the size of the user buffer
 * @ppos: file position
 *
 * Reads/writes up to table->maxlen/sizeof(unsigned int) integer
 * values from/to the user buffer, treated as an ASCII string.
 * The values read are assumed to be in 1/1000 seconds, and
 * are converted into jiffies.
 *
 * Returns 0 on success.
 */
int proc_dointvec_ms_jiffies(struct ctl_table *table, int write,
			     void __user *buffer, size_t *lenp, loff_t *ppos)
{
	return do_proc_dointvec(table, write, buffer, lenp, ppos,
				do_proc_dointvec_ms_jiffies_conv, NULL);
}

static int proc_do_cad_pid(struct ctl_table *table, int write,
			   void __user *buffer, size_t *lenp, loff_t *ppos)
{
	struct pid *new_pid;
	pid_t tmp;
	int r;

	tmp = pid_vnr(cad_pid);

	r = __do_proc_dointvec(&tmp, table, write, buffer,
			       lenp, ppos, NULL, NULL);
	if (r || !write)
		return r;

	new_pid = find_get_pid(tmp);
	if (!new_pid)
		return -ESRCH;

	put_pid(xchg(&cad_pid, new_pid));
	return 0;
}

/**
 * proc_do_large_bitmap - read/write from/to a large bitmap
 * @table: the sysctl table
 * @write: %TRUE if this is a write to the sysctl file
 * @buffer: the user buffer
 * @lenp: the size of the user buffer
 * @ppos: file position
 *
 * The bitmap is stored at table->data and the bitmap length (in bits)
 * in table->maxlen.
 *
 * We use a range comma separated format (e.g. 1,3-4,10-10) so that
 * large bitmaps may be represented in a compact manner. Writing into
 * the file will clear the bitmap then update it with the given input.
 *
 * Returns 0 on success.
 */
int proc_do_large_bitmap(struct ctl_table *table, int write,
			 void __user *buffer, size_t *lenp, loff_t *ppos)
{
	int err = 0;
	bool first = 1;
	size_t left = *lenp;
	unsigned long bitmap_len = table->maxlen;
	unsigned long *bitmap = (unsigned long *) table->data;
	unsigned long *tmp_bitmap = NULL;
	char tr_a[] = { '-', ',', '\n' }, tr_b[] = { ',', '\n', 0 }, c;

	if (!bitmap_len || !left || (*ppos && !write)) {
		*lenp = 0;
		return 0;
	}

	if (write) {
		unsigned long page = 0;
		char *kbuf;

		if (left > PAGE_SIZE - 1)
			left = PAGE_SIZE - 1;

		page = __get_free_page(GFP_TEMPORARY);
		kbuf = (char *) page;
		if (!kbuf)
			return -ENOMEM;
		if (copy_from_user(kbuf, buffer, left)) {
			free_page(page);
			return -EFAULT;
		}
		kbuf[left] = 0;

		tmp_bitmap = kzalloc(BITS_TO_LONGS(bitmap_len) * sizeof(unsigned long),
				     GFP_KERNEL);
		if (!tmp_bitmap) {
			free_page(page);
			return -ENOMEM;
		}
		proc_skip_char(&kbuf, &left, '\n');
		while (!err && left) {
			unsigned long val_a, val_b;
			bool neg;

			err = proc_get_long(&kbuf, &left, &val_a, &neg, tr_a,
					     sizeof(tr_a), &c);
			if (err)
				break;
			if (val_a >= bitmap_len || neg) {
				err = -EINVAL;
				break;
			}

			val_b = val_a;
			if (left) {
				kbuf++;
				left--;
			}

			if (c == '-') {
				err = proc_get_long(&kbuf, &left, &val_b,
						     &neg, tr_b, sizeof(tr_b),
						     &c);
				if (err)
					break;
				if (val_b >= bitmap_len || neg ||
				    val_a > val_b) {
					err = -EINVAL;
					break;
				}
				if (left) {
					kbuf++;
					left--;
				}
			}

			bitmap_set(tmp_bitmap, val_a, val_b - val_a + 1);
			first = 0;
			proc_skip_char(&kbuf, &left, '\n');
		}
		free_page(page);
	} else {
		unsigned long bit_a, bit_b = 0;

		while (left) {
			bit_a = find_next_bit(bitmap, bitmap_len, bit_b);
			if (bit_a >= bitmap_len)
				break;
			bit_b = find_next_zero_bit(bitmap, bitmap_len,
						   bit_a + 1) - 1;

			if (!first) {
				err = proc_put_char(&buffer, &left, ',');
				if (err)
					break;
			}
			err = proc_put_long(&buffer, &left, bit_a, false);
			if (err)
				break;
			if (bit_a != bit_b) {
				err = proc_put_char(&buffer, &left, '-');
				if (err)
					break;
				err = proc_put_long(&buffer, &left, bit_b, false);
				if (err)
					break;
			}

			first = 0;
			bit_b++;
		}
		if (!err)
			err = proc_put_char(&buffer, &left, '\n');
	}

	if (!err) {
		if (write) {
			if (*ppos)
				bitmap_or(bitmap, bitmap, tmp_bitmap, bitmap_len);
			else
				bitmap_copy(bitmap, tmp_bitmap, bitmap_len);
		}
		kfree(tmp_bitmap);
		*lenp -= left;
		*ppos += *lenp;
		return 0;
	} else {
		kfree(tmp_bitmap);
		return err;
	}
}
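
/*
 * Illustration only, not part of the kernel build: a userspace sketch of
 * the comma separated range format produced by the read path above, using
 * a single 64-bit word instead of a kernel bitmap. format_ranges is a
 * hypothetical helper; like the read path, it prints a run of one bit as
 * a bare number ("10") rather than as "10-10".
 */
```c
#include <assert.h>
#include <stdint.h>
#include <stdio.h>
#include <string.h>

/*
 * Format the set bits of a 64-bit word as comma separated ranges,
 * e.g. bits 1, 3, 4 and 10 become "1,3-4,10".
 * Returns the number of characters written (excluding the NUL),
 * or -1 if buf is too small.
 */
static int format_ranges(uint64_t bits, char *buf, size_t len)
{
	size_t used = 0;
	int first = 1;
	unsigned int a = 0;

	assert(buf != NULL && len > 0);
	buf[0] = '\0';

	while (a < 64) {
		unsigned int b;
		int n;

		/* Skip forward to the next set bit. */
		if (!(bits & (UINT64_C(1) << a))) {
			a++;
			continue;
		}
		/* Extend b to the last bit of this run of set bits. */
		b = a;
		while (b + 1 < 64 && (bits & (UINT64_C(1) << (b + 1))))
			b++;

		if (a == b)
			n = snprintf(buf + used, len - used, "%s%u",
				     first ? "" : ",", a);
		else
			n = snprintf(buf + used, len - used, "%s%u-%u",
				     first ? "" : ",", a, b);
		if (n < 0 || (size_t)n >= len - used)
			return -1;
		used += (size_t)n;
		first = 0;
		a = b + 1;
	}
	return (int)used;
}
```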

#else /* CONFIG_PROC_SYSCTL */

int proc_dostring(struct ctl_table *table, int write,
		  void __user *buffer, size_t *lenp, loff_t *ppos)
{
	return -ENOSYS;
}

int proc_dointvec(struct ctl_table *table, int write,
		  void __user *buffer, size_t *lenp, loff_t *ppos)
{
	return -ENOSYS;
}

int proc_dointvec_minmax(struct ctl_table *table, int write,
		    void __user *buffer, size_t *lenp, loff_t *ppos)
{
	return -ENOSYS;
}

int proc_dointvec_jiffies(struct ctl_table *table, int write,
		    void __user *buffer, size_t *lenp, loff_t *ppos)
{
	return -ENOSYS;
}

int proc_dointvec_userhz_jiffies(struct ctl_table *table, int write,
		    void __user *buffer, size_t *lenp, loff_t *ppos)
{
	return -ENOSYS;
}

int proc_dointvec_ms_jiffies(struct ctl_table *table, int write,
			     void __user *buffer, size_t *lenp, loff_t *ppos)
{
	return -ENOSYS;
}

int proc_doulongvec_minmax(struct ctl_table *table, int write,
		    void __user *buffer, size_t *lenp, loff_t *ppos)
{
	return -ENOSYS;
}

int proc_doulongvec_ms_jiffies_minmax(struct ctl_table *table, int write,
				      void __user *buffer,
				      size_t *lenp, loff_t *ppos)
{
	return -ENOSYS;
}


#endif /* CONFIG_PROC_SYSCTL */

/*
 * No sense putting this after each symbol definition, twice,
 * exception granted :-)
 */
EXPORT_SYMBOL(proc_dointvec);
EXPORT_SYMBOL(proc_dointvec_jiffies);
EXPORT_SYMBOL(proc_dointvec_minmax);
EXPORT_SYMBOL(proc_dointvec_userhz_jiffies);
EXPORT_SYMBOL(proc_dointvec_ms_jiffies);
EXPORT_SYMBOL(proc_dostring);
EXPORT_SYMBOL(proc_doulongvec_minmax);
EXPORT_SYMBOL(proc_doulongvec_ms_jiffies_minmax);