kernel_optimize_test/arch/x86
Linus Torvalds e56b3bc794 cpu masks: optimize and clean up cpumask_of_cpu()
Clean up and optimize cpumask_of_cpu(), by sharing all the zero words.

Instead of stupidly generating all possible i=0...NR_CPUS 2^i patterns
creating a huge array of constant bitmasks, realize that the zero words
can be shared.

In other words, on a 64-bit architecture, we only ever need 64 of these
arrays - with a different bit set in one single world (with enough zero
words around it so that we can create any bitmask by just offsetting in
that big array). And then we just put enough zeroes around it that we
can point every single cpumask to be one of those things.

So when we have 4k CPU's, instead of having 4k arrays (of 4k bits each,
with one bit set in each array - 2MB memory total), we have exactly 64
arrays instead, each 8k bits in size (64kB total).

And then we just point cpumask(n) to the right position (which we can
calculate dynamically). Once we have the right arrays, getting
"cpumask(n)" ends up being:

  static inline const cpumask_t *get_cpu_mask(unsigned int cpu)
  {
          const unsigned long *p = cpu_bit_bitmap[1 + cpu % BITS_PER_LONG];
          p -= cpu / BITS_PER_LONG;
          return (const cpumask_t *)p;
  }

This brings other advantages and simplifications as well:

 - we are not wasting memory that is just filled with a single bit in
   various different places

 - we don't need all those games to re-create the arrays in some dense
   format, because they're already going to be dense enough.

if we compile a kernel for up to 4k CPU's, "wasting" that 64kB of memory
is a non-issue (especially since by doing this "overlapping" trick we
probably get better cache behaviour anyway).

[ mingo@elte.hu:

  Converted Linus's mails into a commit. See:

     http://lkml.org/lkml/2008/7/27/156
     http://lkml.org/lkml/2008/7/28/320

  Also applied a family filter - which also has the side-effect of leaving
  out the bits where Linus calls me an idio... Oh, never mind ;-)
]

Signed-off-by: Ingo Molnar <mingo@elte.hu>
Cc: Rusty Russell <rusty@rustcorp.com.au>
Cc: Andrew Morton <akpm@linux-foundation.org>
Cc: Al Viro <viro@ZenIV.linux.org.uk>
Cc: Mike Travis <travis@sgi.com>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
2008-07-28 22:20:41 +02:00
..
boot inflate: refactor inflate malloc code 2008-07-25 10:53:28 -07:00
configs Subject: devmem, x86: fix rename of CONFIG_NONPROMISC_DEVMEM 2008-07-20 08:35:55 +02:00
crypto [CRYPTO] aes-x86-32: Remove unused return code 2008-04-21 10:19:21 +08:00
ia32 tracehook: exec 2008-07-26 12:00:08 -07:00
kernel cpu masks: optimize and clean up cpumask_of_cpu() 2008-07-28 22:20:41 +02:00
kvm KVM: VMX: Fix undefined beaviour of EPT after reload kvm-intel.ko 2008-07-27 11:34:10 +03:00
lguest x86: APIC: remove apic_write_around(); use alternatives 2008-07-18 12:51:21 +02:00
lib Merge branch 'generic-ipi' into generic-ipi-for-linus 2008-07-15 21:55:59 +02:00
mach-default x86: add ->pre_time_init to x86_quirks 2008-07-20 09:25:52 +02:00
mach-es7000 x86: move the last Dprintk instance to pr_debug() 2008-07-21 21:58:34 +02:00
mach-generic x86: make generic arch support NUMAQ, fix 2008-07-08 10:35:45 +02:00
mach-rdc321x x86, rdc321x: remove watchdog file 2008-04-17 17:40:50 +02:00
mach-voyager Merge branch 'generic-ipi' into generic-ipi-for-linus 2008-07-15 21:55:59 +02:00
math-emu x86: coding style fixes to arch/x86/math-emu/reg_constant 2008-06-18 15:00:13 +02:00
mm x86: use generic show_mem() 2008-07-26 12:00:10 -07:00
oprofile x86/oprofile/nmi_int: add Nehalem to list of ppro cores 2008-07-24 17:29:00 -07:00
pci use generic_access_phys for /dev/mem mappings 2008-07-24 10:47:15 -07:00
power x86: remove end_pfn in 64bit 2008-07-08 13:10:38 +02:00
vdso Merge branches 'x86/urgent', 'x86/amd-iommu', 'x86/apic', 'x86/cleanups', 'x86/core', 'x86/cpu', 'x86/fixmap', 'x86/gart', 'x86/kprobes', 'x86/memtest', 'x86/modules', 'x86/nmi', 'x86/pat', 'x86/reboot', 'x86/setup', 'x86/step', 'x86/unify-pci', 'x86/uv', 'x86/xen' and 'xen-64bit' into x86/for-linus 2008-07-21 16:37:17 +02:00
video x86: video/fbdev.c: add MODULE_LICENSE 2008-05-04 20:04:46 +02:00
xen xen: don't use sysret for sysexit32 2008-07-24 12:28:12 +02:00
Kconfig Merge branch 'x86-fixes-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip 2008-07-26 13:25:05 -07:00
Kconfig.cpu x86: fix crash due to missing debugctlmsr on AMD K6-3 2008-07-22 14:16:37 +02:00
Kconfig.debug x86: Fix help message for STRICT_DEVMEM config option 2008-07-21 13:04:08 -07:00
Makefile x86, RDC321x: add to mach-default 2008-07-26 13:51:46 +02:00
Makefile_32.cpu