kernel_optimize_test/include/linux
KOSAKI Motohiro 31f1de46b9 mempolicy: silently restrict nodemask to allowed nodes
Kosaki Motohiro noted that "numactl --interleave=all ..." failed in the
presence of memoryless nodes.  This patch attempts to fix that problem.

Some background:

numactl --interleave=all calls set_mempolicy(2) with a fully populated
[out to MAXNUMNODES] nodemask.  set_mempolicy() [in do_set_mempolicy()]
calls contextualize_policy() which requires that the nodemask be a
subset of the current task's mems_allowed; else EINVAL will be returned.

A task's mems_allowed will always be a subset of node_states[N_HIGH_MEMORY]
i.e., nodes with memory.  So, a fully populated nodemask will be
declared invalid if it includes memoryless nodes.

  NOTE:  the same thing will occur when running in a cpuset
         with restricted mems_allowed--for the same reason:
         node mask contains dis-allowed nodes.

mbind(2), on the other hand, just masks off any nodes in the nodemask
that are not included in the caller's mems_allowed.
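
As a rough illustration of the two pre-patch behaviors described above
(a userspace model, not the kernel code), the difference can be sketched
with plain bitmaps.  The names old_set_mempolicy_check() and
old_mbind_mask() and the 64-bit nodemask_t are assumptions made for this
sketch only; the kernel uses its own nodemask type and helpers.

```c
#include <assert.h>
#include <errno.h>
#include <stdint.h>

/* Hypothetical model: one bit per node, not the kernel's nodemask_t. */
typedef uint64_t nodemask_t;

/*
 * Model of the old set_mempolicy() path: any node outside
 * mems_allowed makes the whole request fail with EINVAL.
 */
static int old_set_mempolicy_check(nodemask_t nodes, nodemask_t mems_allowed)
{
	return (nodes & ~mems_allowed) ? -EINVAL : 0;
}

/*
 * Model of the old mbind() path: nodes outside mems_allowed are
 * masked off instead of being reported as an error.
 */
static nodemask_t old_mbind_mask(nodemask_t nodes, nodemask_t mems_allowed)
{
	nodemask_t result = nodes & mems_allowed;

	/* Invariant: the masked result is always a subset of mems_allowed. */
	assert((result & ~mems_allowed) == 0);
	return result;
}
```

With four nodes where node 2 is memoryless, mems_allowed is 0xB, so an
"all nodes" mask of 0xF is rejected by the first function and reduced
to 0xB by the second.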

In each case [mbind() and set_mempolicy()], mpol_check_policy() will
complain [again, resulting in EINVAL] if the nodemask contains any
memoryless nodes.  This is somewhat redundant as mpol_new() will remove
memoryless nodes for interleave policy, as will bind_zonelist()--called
by mpol_new() for BIND policy.

Proposed fix:

1) modify contextualize_policy logic to:
   a) remember whether the incoming node mask is empty.
   b) if not, restrict the nodemask to allowed nodes, as is
      currently done in-line for mbind().  This guarantees
      that the resulting mask includes only nodes with memory.

      NOTE:  this is a [benign, IMO] change in behavior for
             set_mempolicy().  Dis-allowed nodes will be
             silently ignored, rather than returning an error.

   c) fold this code into mpol_check_policy(), replace 2 calls to
      contextualize_policy() to call mpol_check_policy() directly
      and remove contextualize_policy().

2) In existing mpol_check_policy() logic, after "contextualization":
   a) MPOL_DEFAULT:  require that incoming mask "was_empty"
   b) MPOL_{BIND|INTERLEAVE}:  require that contextualized nodemask
      contains at least one node.
   c) add a case for MPOL_PREFERRED:  if incoming was not empty
      and resulting mask IS empty, user specified invalid nodes.
      Return EINVAL.
   d) remove the now redundant check for memoryless nodes

3) remove the now redundant masking of policy nodes for interleave
   policy from mpol_new().

4) Now that mpol_check_policy() contextualizes the nodemask, remove
   the in-line nodes_and() from sys_mbind().  I believe that this
   restores mbind() to the behavior before the memoryless-nodes
   patch series.  E.g., we'll no longer treat an invalid nodemask
   with MPOL_PREFERRED as local allocation.
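
The checks in steps 1 and 2 can be sketched as a small userspace model.
This is only an illustration under stated assumptions: the function name
check_policy_model(), the 64-bit nodemask_t, and passing mems_allowed as
an explicit "allowed" argument are all made up for the sketch, and the
model requires a non-NULL mask.  It is not the patched
mpol_check_policy() itself.

```c
#include <assert.h>
#include <errno.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical model: one bit per node, not the kernel's nodemask_t. */
typedef uint64_t nodemask_t;

/* Policy modes, numbered as in the kernel's mempolicy.h. */
enum { MPOL_DEFAULT, MPOL_PREFERRED, MPOL_BIND, MPOL_INTERLEAVE };

/*
 * "allowed" stands in for the current task's mems_allowed.
 * On return, *nodes has been restricted to the allowed nodes.
 */
static int check_policy_model(int mode, nodemask_t *nodes, nodemask_t allowed)
{
	bool was_empty, is_empty;

	assert(nodes);			/* assumption of this sketch */

	was_empty = (*nodes == 0);	/* step 1a: remember incoming state */
	*nodes &= allowed;		/* step 1b: silently drop dis-allowed nodes */
	is_empty = (*nodes == 0);	/* state after "contextualization" */

	switch (mode) {
	case MPOL_DEFAULT:		/* step 2a: incoming mask must be empty */
		if (!was_empty)
			return -EINVAL;
		break;
	case MPOL_BIND:
	case MPOL_INTERLEAVE:		/* step 2b: need at least one node */
		if (is_empty)
			return -EINVAL;
		break;
	case MPOL_PREFERRED:		/* step 2c: caller named only invalid nodes */
		if (!was_empty && is_empty)
			return -EINVAL;
		break;
	}
	return 0;
}
```

In this model an interleave request for nodes 0xF with allowed nodes 0xB
succeeds and leaves the mask at 0xB, while a preferred request for node
2 alone (0x4) fails with EINVAL rather than falling back to local
allocation.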

[ Patch history:

  v1 -> v2:
   - Communicate whether or not incoming node mask was empty to
     mpol_check_policy() for better error checking.
   - As suggested by David Rientjes, remove the now unused
     cpuset_nodes_subset_current_mems_allowed() from cpuset.h

  v2 -> v3:
   - As suggested by Kosaki Motohiro, fold the "contextualization"
     of policy nodemask into mpol_check_policy().  Looks a little
     cleaner. ]

Signed-off-by:  Lee Schermerhorn <lee.schermerhorn@hp.com>
Signed-off-by:  KOSAKI Motohiro <kosaki.motohiro@jp.fujitsu.com>
Tested-by:      KOSAKI Motohiro <kosaki.motohiro@jp.fujitsu.com>
Acked-by:       David Rientjes <rientjes@google.com>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2008-02-11 20:48:29 -08:00
amba
byteorder byteorder: move le32_add_cpu & friends from OCFS2 to core 2008-02-08 09:22:32 -08:00
can
dvb
hdlc
i2c
isdn
lockd
mfd
mlx4 IB/mlx4: Use multiple WQ blocks to post smaller send WQEs 2008-02-08 13:30:02 -08:00
mmc
mtd
netfilter
netfilter_arp
netfilter_bridge
netfilter_ipv4
netfilter_ipv6
nfsd
raid
rtc
spi
ssb
sunrpc nfsd: clean up svc_reserve_auth() 2008-02-10 18:11:16 -05:00
tc_act
tc_ematch
usb
8250_pci.h
a.out.h aout: suppress A.OUT library support if !CONFIG_ARCH_SUPPORTS_AOUT 2008-02-08 09:22:30 -08:00
ac97_codec.h
acct.h
acpi_pmtmr.h
acpi.h
adb.h
adfs_fs_i.h
adfs_fs_sb.h
adfs_fs.h
aer.h
affs_hardblocks.h
agp_backend.h
agpgart.h
aio_abi.h
aio.h
amifd.h
amifdreg.h
amigaffs.h
anon_inodes.h
apm_bios.h
apm-emulation.h
arcdevice.h
arcfb.h
async_tx.h
ata_platform.h
ata.h
atalk.h
atm_eni.h
atm_he.h
atm_idt77105.h
atm_nicstar.h
atm_suni.h
atm_tcp.h
atm_zatm.h
atm.h
atmapi.h
atmarp.h
atmbr2684.h
atmclip.h
atmdev.h
atmel_pdc.h
atmel_pwm.h Basic PWM driver for AVR32 and AT91 2008-02-08 09:22:38 -08:00
atmel_serial.h
atmel-ssc.h
atmioc.h
atmlec.h
atmmpc.h
atmppp.h
atmsap.h
atmsvc.h
attribute_container.h
audit.h
auto_fs4.h
auto_fs.h
auxvec.h
ax25.h
b1lli.h
b1pcmcia.h
backing-dev.h
backlight.h
baycom.h
bcd.h
bfs_fs.h
binfmts.h
bio.h
bit_spinlock.h
bitmap.h
bitops.h
bitrev.h
blkdev.h block: fixup rq_init() a bit 2008-02-08 12:41:03 +01:00
blkpg.h
blktrace_api.h
blockgroup_lock.h
bootmem.h
bottom_half.h
bpqether.h
bsg.h
buffer_head.h
bug.h
cache.h
calc64.h
can.h
capability.h
capi.h
cciss_ioctl.h
cd1400.h
cdev.h
cdk.h
cdrom.h
cfag12864b.h
cgroup_subsys.h
cgroup.h
cgroupstats.h
chio.h
circ_buf.h
clk.h
clockchips.h
clocksource.h
cm4000_cs.h
cn_proc.h
coda_cache.h
coda_fs_i.h
coda_linux.h
coda_psdev.h
coda.h
coff.h
com20020.h
compat.h
compiler-gcc3.h
compiler-gcc4.h x86, core: remove CONFIG_FORCED_INLINING 2008-02-09 23:24:09 +01:00
compiler-gcc.h
compiler-intel.h
compiler.h
completion.h
comstats.h
concap.h
configfs.h
connector.h
console_struct.h
console.h
consolemap.h
const.h
cpu.h
cpufreq.h
cpuidle.h
cpumask.h
cpuset.h mempolicy: silently restrict nodemask to allowed nodes 2008-02-11 20:48:29 -08:00
cramfs_fs_sb.h
cramfs_fs.h
crash_dump.h
crc7.h
crc16.h
crc32.h
crc32c.h
crc-ccitt.h
crc-itu-t.h
crypto.h
cryptohash.h
ctype.h
cuda.h
cyclades.h
cyclomx.h
cycx_cfm.h
cycx_drv.h
cycx_x25.h
dca.h DCA: convert struct class_device to struct device. 2008-02-08 15:33:33 -08:00
dcache.h
dccp.h
dcookies.h
debug_locks.h
debugfs.h
delay.h
delayacct.h
device-mapper.h dm: table remove unused variable 2008-02-08 02:10:01 +00:00
device.h
devpts_fs.h
dio.h
dirent.h
display.h
dlm_device.h
dlm_netlink.h
dlm.h
dlmconstants.h
dm9000.h
dm-ioctl.h dm ioctl: move compat code 2008-02-08 02:09:56 +00:00
dma-mapping.h
dmaengine.h
dmapool.h
dmar.h intel-iommu: fault_reason index cleanup 2008-02-08 09:22:24 -08:00
dmi.h SMBIOS/DMI: add type 41 = Onboard Devices Extended Information 2008-02-08 09:22:37 -08:00
dn.h
dnotify.h
dqblk_v1.h
dqblk_v2.h
dqblk_xfs.h
ds1wm.h
ds1286.h
ds17287rtc.h
dtlk.h
edac.h
edd.h
eeprom_93cx6.h
efi.h
efs_dir.h
efs_fs_i.h
efs_fs_sb.h
efs_fs.h
efs_vh.h
eisa.h
elevator.h
elf-em.h mn10300: add the MN10300/AM33 architecture to the kernel 2008-02-08 09:22:30 -08:00
elf-fdpic.h
elf.h
elfcore-compat.h
elfcore.h
elfnote.h
enclosure.h [SCSI] enclosure: add support for enclosure services 2008-02-07 18:04:10 -06:00
err.h
errno.h
errqueue.h
etherdevice.h
ethtool.h
eventfd.h
eventpoll.h
exportfs.h
ext2_fs_sb.h
ext2_fs.h
ext3_fs_i.h
ext3_fs_sb.h
ext3_fs.h
ext3_jbd.h
ext4_fs_extents.h
ext4_fs_i.h
ext4_fs_sb.h
ext4_fs.h ext4: Add new "development flag" to the ext4 filesystem 2008-02-10 01:11:44 -05:00
ext4_jbd2.h
f75375s.h
fadvise.h
falloc.h
fault-inject.h
fb.h
fcdevice.h
fcntl.h
fd1772.h
fd.h
fddidevice.h
fdreg.h
fib_rules.h
file.h
filter.h
firewire-cdev.h
firewire-constants.h
firmware.h
flat.h
font.h
freezer.h
fs_enet_pd.h
fs_stack.h
fs_struct.h
fs_uart_pd.h
fs.h fs/char_dev.c: chrdev_open marked static and removed from fs.h 2008-02-08 09:22:42 -08:00
fsl_devices.h
fsnotify.h
fuse.h
futex.h
gameport.h
gen_stats.h
genalloc.h
generic_acl.h
generic_serial.h
genetlink.h
genhd.h Enhanced partition statistics: remove old partition statistics 2008-02-08 12:42:01 +01:00
getcpu.h
gfp.h
gfs2_ondisk.h
gigaset_dev.h
gpio_keys.h
gpio_mouse.h
hardirq.h
harrier_defs.h
hash.h
hayesesp.h
hdlc.h
hdlcdrv.h
hdpu_features.h
hdreg.h
hdsmart.h
hid-debug.h
hid.h
hiddev.h
hidraw.h
highmem.h
highuid.h
hil_mlc.h
hil.h
hippidevice.h
hp_sdc.h
hpet.h
hrtimer.h hrtimer: fix *rmtp handling in hrtimer_nanosleep() 2008-02-10 10:48:03 +01:00
htirq.h
hugetlb.h hugetlb: add locking for overcommit sysctl 2008-02-08 09:22:23 -08:00
hw_random.h
hwmon-sysfs.h
hwmon-vid.h
hwmon.h
hysdn_if.h
i2c-algo-bit.h
i2c-algo-pca.h
i2c-algo-pcf.h
i2c-algo-sgi.h
i2c-dev.h
i2c-gpio.h
i2c-id.h hwmon: Discard useless I2C driver IDs 2008-02-07 20:39:44 -05:00
i2c-ocores.h
i2c-pnx.h
i2c-pxa.h
i2c.h
i2o-dev.h
i2o.h
i8k.h
i8042.h
ibmtr.h
icmp.h
icmpv6.h
ide.h Prevent IDE boot ops on NUMA system 2008-02-11 09:20:50 -08:00
idr.h
ieee80211.h
if_addr.h
if_addrlabel.h
if_arcnet.h
if_arp.h
if_bonding.h
if_bridge.h
if_cablemodem.h
if_ec.h
if_eql.h
if_ether.h
if_fc.h
if_fddi.h
if_frad.h
if_hippi.h
if_infiniband.h
if_link.h
if_ltalk.h
if_macvlan.h
if_packet.h
if_plip.h
if_ppp.h
if_pppol2tp.h
if_pppox.h
if_slip.h
if_strip.h
if_tr.h
if_tun.h
if_tunnel.h
if_vlan.h
if_wanpipe.h
if.h
igmp.h
in6.h
in_route.h
in.h
inet_diag.h
inet_lro.h
inet.h
inetdevice.h
init_ohci1394_dma.h
init_task.h
init.h
initrd.h
inotify.h
input-polldev.h
input.h
interrupt.h
io.h
ioc3.h
ioc4.h
iocontext.h
ioctl.h
iommu-helper.h
ioport.h
ioprio.h
ip6_tunnel.h
ip.h
ipc_namespace.h IPC: consolidate sem_exit_ns(), msg_exit_ns() and shm_exit_ns() 2008-02-08 09:22:26 -08:00
ipc.h namespaces: move the IPC namespace under IPC_NS option 2008-02-08 09:22:23 -08:00
ipmi_msgdefs.h
ipmi_smi.h
ipmi.h
ipsec.h
ipv6_route.h
ipv6.h
ipx.h
irda.h
irq_cpustat.h
irq.h IRQ_NOPROBE helper functions 2008-02-08 09:22:42 -08:00
irqflags.h
irqreturn.h
isa.h
isapnp.h
isdn_divertif.h
isdn_ppp.h
isdn.h
isdnif.h
isicom.h
iso_fs.h
istallion.h
ivtv.h
ivtvfb.h
ixjuser.h
jbd2.h
jbd.h
jffs2.h
jhash.h
jiffies.h time: fix typo in comments 2008-02-08 09:22:29 -08:00
journal-head.h
joystick.h
kallsyms.h
kbd_diacr.h
kbd_kern.h
Kbuild drop linux/ufs_fs.h from userspace export and relocate it to fs/ufs/ufs_fs.h 2008-02-08 09:22:39 -08:00
kd.h
kdebug.h
kdev_t.h
kernel_stat.h
kernel.h Add new string functions strict_strto* and convert kernel params to use them 2008-02-08 09:22:41 -08:00
kernelcapi.h
kexec.h
key-type.h
key-ui.h
key.h
keyboard.h
keyctl.h
kfifo.h
klist.h
kmalloc_sizes.h
kmod.h
kobj_map.h
kobject.h
kprobes.h
kref.h
ks0108.h
kthread.h
ktime.h
kvm_host.h
kvm_para.h
kvm_types.h
kvm.h
lapb.h
latencytop.h
lcd.h
leds.h
lguest_launcher.h
lguest.h
libata.h
libps2.h
license.h
limits.h
linkage.h
linux_logo.h
list.h
llc.h
lm_interface.h
lock_dlm_plock.h
lockdep.h
log2.h
loop.h
lp.h
lzo.h
m48t86.h
magic.h
major.h
maple.h
marker.h
matroxfb.h
mbcache.h
mc6821.h
mc146818rtc.h
mca-legacy.h
mca.h
mdio-bitbang.h
memcontrol.h memcontrol: add vm_match_cgroup() 2008-02-09 11:08:33 -08:00
memory_hotplug.h
memory.h
mempolicy.h
mempool.h
memstick.h memstick: initial commit for Sony MemoryStick support 2008-02-09 11:08:34 -08:00
meye.h
migrate.h
mii.h
minix_fs.h
miscdevice.h
mm_inline.h
mm_types.h SLUB: Use unique end pointer for each slab page. 2008-02-07 17:47:41 -08:00
mm.h CONFIG_HIGHPTE vs. sub-page page tables. 2008-02-08 09:22:42 -08:00
mman.h
mmtimer.h
mmzone.h
mnt_namespace.h
mod_devicetable.h
module.h fix "modules: make module_address_lookup() safe" 2008-02-08 09:22:24 -08:00
moduleloader.h
moduleparam.h
mount.h
mpage.h
mqueue.h
mroute.h
msdos_fs.h
msg.h
msi.h
mtio.h
mutex-debug.h
mutex.h Remove fastcall from linux/include 2008-02-08 09:22:31 -08:00
mv643xx_eth.h
mv643xx_i2c.h
mv643xx.h
n_r3964.h
namei.h
nbd.h NBD: remove limit on max number of nbd devices 2008-02-08 09:22:41 -08:00
ncp_fs_i.h
ncp_fs_sb.h
ncp_fs.h
ncp_mount.h
ncp_no.h
ncp.h
neighbour.h
net.h
netdevice.h
netfilter_arp.h
netfilter_bridge.h
netfilter_decnet.h
netfilter_ipv4.h
netfilter_ipv6.h
netfilter.h
netlink.h
netpoll.h
netrom.h
nfs2.h
nfs3.h
nfs4_acl.h
nfs4_mount.h
nfs4.h
nfs_fs_i.h
nfs_fs_sb.h
nfs_fs.h
nfs_idmap.h
nfs_mount.h
nfs_page.h
nfs_xdr.h
nfs.h
nfsacl.h
nfsd_idmap.h
nl80211.h
nls.h
nmi.h
node.h
nodemask.h
notifier.h
nsc_gpio.h
nsproxy.h
nubus.h
numa.h
nvram.h
of_device.h
of_platform.h
of.h
oom.h
oprofile.h
page-flags.h
page-isolation.h
pageblock-flags.h
pagemap.h
pagevec.h
param.h
parport_pc.h
parport.h
parser.h
patchkey.h
pci_hotplug.h
pci_ids.h
pci_regs.h
pci-acpi.h
pci.h Change pci_raw_ops to pci_raw_read/write 2008-02-10 12:52:46 -08:00
pcieport_if.h
pcounter.h
pda_power.h
percpu_counter.h
percpu.h
personality.h
pfkeyv2.h [IPSEC]: Add support for aes-ctr. 2008-02-07 23:11:56 -08:00
pfn.h
pg.h
phantom.h
phonedev.h
phy_fixed.h
phy.h
pid_namespace.h namespaces: cleanup the code managed with PID_NS option 2008-02-08 09:22:23 -08:00
pid.h uglify while_each_pid_task() to make sure we don't count the execing pricess twice 2008-02-08 09:22:28 -08:00
pipe_fs_i.h
pkt_cls.h
pkt_sched.h
pktcdvd.h
platform_device.h
plist.h
pm_legacy.h
pm_qos_params.h
pm.h
pmu.h
pnp.h
pnpbios.h
poison.h
poll.h
posix_acl_xattr.h
posix_acl.h
posix_types.h
posix-timers.h
power_supply.h
ppdev.h
ppp_channel.h
ppp_defs.h
ppp-comp.h
prctl.h
preempt.h Remove fastcall from linux/include 2008-02-08 09:22:31 -08:00
prefetch.h
prio_heap.h
prio_tree.h
proc_fs.h proc: fix ->open'less usage due to ->proc_fops flip 2008-02-08 09:22:24 -08:00
profile.h
proportions.h
ps2esdi.h
ptrace.h kill PT_ATTACHED 2008-02-08 09:22:26 -08:00
qnx4_fs.h
qnxtypes.h
quicklist.h
quota.h
quotaio_v1.h
quotaio_v2.h
quotaops.h
radeonfb.h
radix-tree.h
raid_class.h
ramfs.h
random.h
raw.h
rbtree.h
rcuclassic.h
rcupdate.h
rcupreempt_trace.h
rcupreempt.h preemptible RCU: sparse annotations 2008-02-08 09:22:42 -08:00
reboot.h
reciprocal_div.h
regset.h
reiserfs_acl.h
reiserfs_fs_i.h
reiserfs_fs_sb.h
reiserfs_fs.h use __u32 in linux/reiserfs_fs.h 2008-02-08 09:22:41 -08:00
reiserfs_xattr.h
relay.h
res_counter.h
resource.h
resume-trace.h
rfkill.h
rio_drv.h
rio_ids.h
rio_regs.h
rio.h
rmap.h
romfs_fs.h
root_dev.h
rose.h
route.h
rslib.h
rtc-v3020.h
rtc.h
rtmutex.h
rtnetlink.h
rwsem-spinlock.h
rwsem.h
rxrpc.h
sc26198.h
scatterlist.h
scc.h
sched.h Get rid of the kill_pgrp_info() function 2008-02-08 09:22:29 -08:00
screen_info.h
sctp.h
scx200_gpio.h
scx200.h
sdla.h
seccomp.h
securebits.h
security.h
selection.h
selinux_netlink.h
selinux.h
sem.h
seq_file.h
seqlock.h
serial167.h
serial_8250.h
serial_core.h mn10300: allocate serial port UART IDs for on-chip serial ports 2008-02-08 09:22:30 -08:00
serial_pnx8xxx.h
serial_reg.h
serial.h
serialP.h
serio.h
shm.h
shmem_fs.h mount options: fix tmpfs 2008-02-08 09:22:41 -08:00
signal.h fix group stop with exit race 2008-02-08 09:22:27 -08:00
signalfd.h
skbuff.h
slab_def.h
slab.h
slob_def.h
slub_def.h SLUB: Support for performance statistics 2008-02-07 17:47:41 -08:00
sm501-regs.h
sm501.h
smb_fs_i.h
smb_fs_sb.h
smb_fs.h
smb_mount.h
smb.h
smbno.h
smp_lock.h
smp.h
snmp.h
socket.h
sockios.h
som.h
sonet.h
sony-laptop.h
sonypi.h
sort.h
sound.h
soundcard.h
spinlock_api_smp.h
spinlock_api_up.h
spinlock_types_up.h
spinlock_types.h
spinlock_up.h
spinlock.h Remove fastcall from linux/include 2008-02-08 09:22:31 -08:00
splice.h
srcu.h
stacktrace.h
stallion.h
start_kernel.h
stat.h
statfs.h
stddef.h
stop_machine.h
string.h
stringify.h
superhyway.h
suspend_ioctls.h
suspend.h
svga.h
swap.h
swapops.h Fix compile error on nommu for is_swap_pte 2008-02-09 11:08:33 -08:00
synclink.h
sys.h
syscalls.h
sysctl.h
sysdev.h
sysfs.h
sysrq.h
sysv_fs.h
task_io_accounting_ops.h
task_io_accounting.h
taskstats_kern.h
taskstats.h
tc.h
tcp.h
telephony.h
termios.h
textsearch_fsm.h
textsearch.h
tfrc.h
thermal.h ACPI: thermal: buildfix for CONFIG_THERMAL=n 2008-02-09 04:01:48 -05:00
thread_info.h
threads.h
tick.h
tifm.h memstick: initial commit for Sony MemoryStick support 2008-02-09 11:08:34 -08:00
time.h timekeeping: rename timekeeping_is_continuous to timekeeping_valid_for_hres 2008-02-08 09:22:29 -08:00
timer.h workqueue: make delayed_work_timer_fn() static 2008-02-08 09:22:37 -08:00
timerfd.h
times.h
timex.h ntp: correct inconsistent interval/tick_length usage 2008-02-10 10:48:03 +01:00
tiocl.h
tipc_config.h
tipc.h
topology.h
toshiba.h
transport_class.h
trdevice.h
tsacct_kern.h
tty_driver.h
tty_flip.h
tty_ldisc.h
tty.h
types.h Remove __STRICT_ANSI__ from linux/types.h 2008-02-08 09:22:39 -08:00
uaccess.h
udf_fs_i.h
udf_fs_sb.h udf: remove some ugly macros 2008-02-08 09:22:34 -08:00
udf_fs.h kill UDFFS_{DATE,VERSION} 2008-02-08 09:22:36 -08:00
udp.h
uinput.h
uio_driver.h
uio.h
ultrasound.h
un.h
unistd.h
unwind.h
usb_usual.h
usb.h
usbdevice_fs.h
user_namespace.h
user.h
utime.h
uts.h
utsname.h namespaces: move the UTS namespace under UTS_NS option 2008-02-08 09:22:23 -08:00
vermagic.h
veth.h
vfs.h
via.h
video_decoder.h
video_encoder.h
video_output.h
videodev2.h
videodev.h
videotext.h
virtio_9p.h
virtio_balloon.h
virtio_blk.h
virtio_config.h
virtio_console.h
virtio_net.h
virtio_pci.h
virtio_ring.h
virtio.h
vmalloc.h
vmstat.h
vt_buffer.h
vt_kern.h
vt.h
w1-gpio.h
wait.h
wanrouter.h
watchdog.h
wireless.h
workqueue.h
writeback.h
x25.h
xattr.h
xfrm.h
xilinxfb.h
yam.h
zconf.h
zlib.h
zorro_ids.h
zorro.h
zutil.h