This patch unifies 'list_sort_test()' messages a bit and makes sure all of
them start with the "list_sort_test:" prefix to make it obvious for users
where the messages come from.
Signed-off-by: Artem Bityutskiy <Artem.Bityutskiy@nokia.com>
Cc: Don Mullis <don.mullis@gmail.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
The 'list_sort()' test does not free memory if it fails, and it panics
the kernel if it cannot allocate memory. This patch fixes those problems.
This patch also changes several small things:
o use the 'list_add()' helper instead of open-coding the list insertion
o introduce a temporary 'el1' variable to avoid an ugly and unreadable
"if" statement
o make 'head' a stack variable instead of a 'kmalloc()'ed one, which
simplifies the code a bit
Overall, this is a clean-up patch.
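For illustration, a rough sketch of the resulting shape of the test (the
element layout, counts and message text here are hypothetical, not the exact
code in lib/list_sort.c):
#include <linux/init.h>
#include <linux/kernel.h>
#include <linux/list.h>
#include <linux/slab.h>

struct debug_el {
	int value;
	struct list_head list;
};

static int __init list_sort_test_sketch(void)
{
	LIST_HEAD(head);			/* 'head' now lives on the stack */
	struct list_head *cur, *tmp;
	struct debug_el *el1;
	int i, err = -ENOMEM;

	for (i = 0; i < 100; i++) {
		el1 = kmalloc(sizeof(*el1), GFP_KERNEL);
		if (!el1) {
			printk(KERN_ERR "list_sort_test: error: cannot allocate memory\n");
			goto exit;		/* report the error, do not panic */
		}
		el1->value = i;
		list_add(&el1->list, &head);	/* helper instead of open-coded insertion */
	}
	err = 0;
exit:
	/* memory is freed on the failure path too */
	list_for_each_safe(cur, tmp, &head) {
		list_del(cur);
		kfree(list_entry(cur, struct debug_el, list));
	}
	return err;
}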
Signed-off-by: Artem Bityutskiy <Artem.Bityutskiy@nokia.com>
Cc: Don Mullis <don.mullis@gmail.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Instead of using its own pseudo-random generator, use the generic Linux
'random32()' function. Presumably, this should improve test coverage.
At the same time, make the following changes:
o Use a shorter macro name for the test list length
o Do not use the odd 'l_h' name for the 'struct list_head' member;
use 'list', because it is the traditional name and thus makes the
code more obvious and readable (a brief sketch follows).
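A minimal sketch of the idea (the macro and element names below are
illustrative, not necessarily those used in the patch):
#include <linux/list.h>
#include <linux/random.h>

#define TEST_LIST_LEN (512)		/* short macro name for the test list length */

struct debug_el {
	int value;
	struct list_head list;		/* conventional 'list' name instead of 'l_h' */
};

static void init_element(struct debug_el *el)
{
	/* generic kernel PRNG instead of a home-grown generator */
	el->value = random32() % (TEST_LIST_LEN / 3);
}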
Signed-off-by: Artem Bityutskiy <Artem.Bityutskiy@nokia.com>
Cc: Don Mullis <don.mullis@gmail.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
I do not see any reason to use KERN_WARN for normal messages and
KERN_EMERG for error messages in the list_sort testing routine. Let's use
the more reasonable KERN_NORM and KERN_ERR levels.
Signed-off-by: Artem Bityutskiy <Artem.Bityutskiy@nokia.com>
Cc: Don Mullis <don.mullis@gmail.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
While hunting a non-existent bug in 'list_sort()', I improved the
'list_sort_test()' function which tests the 'list_sort()' library call.
Although in the end the bug turned out to be in my code rather than in
'list_sort()', I think these clean-ups and improvements are worth merging
because they make the test function better.
This patch:
Make the self-tests selectable via Kconfig rather than by manual enabling
of DEBUG_LIST_SORT.
Signed-off-by: Artem Bityutskiy <Artem.Bityutskiy@nokia.com>
Cc: Don Mullis <don.mullis@gmail.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
All percpu counters are linked to a global list on initialization and
removed from it on destruction. The list is walked during CPU up/down.
If a percpu counter is freed without being properly destroyed, the system
will oops only on the next CPU up/down, making it pretty nasty to track
down. This patch adds debugobj support for percpu counters so that such
problems can be found easily.
As percpu counters don't make sense on the stack and can't be statically
initialized, debugobj support is pretty simple. It's initialized and
activated on counter initialization, and deactivated and destroyed on
counter destruction. With this patch applied, the bug fixed by commit
602586a83b (shmem: put_super must
percpu_counter_destroy) triggers the following warning on tmpfs unmount
and the system won't oops on the next cpu up/down operation.
------------[ cut here ]------------
WARNING: at lib/debugobjects.c:259 debug_print_object+0x5c/0x70()
Hardware name: Bochs
ODEBUG: free active (active state 0) object type: percpu_counter
Modules linked in:
Pid: 3999, comm: umount Not tainted 2.6.36-rc2-work+ #5
Call Trace:
[<ffffffff81083f7f>] warn_slowpath_common+0x7f/0xc0
[<ffffffff81084076>] warn_slowpath_fmt+0x46/0x50
[<ffffffff813b45cc>] debug_print_object+0x5c/0x70
[<ffffffff813b50e5>] debug_check_no_obj_freed+0x125/0x210
[<ffffffff811577d3>] kfree+0xb3/0x2f0
[<ffffffff81132edd>] shmem_put_super+0x1d/0x30
[<ffffffff81162e96>] generic_shutdown_super+0x56/0xe0
[<ffffffff81162f86>] kill_anon_super+0x16/0x60
[<ffffffff81162ff7>] kill_litter_super+0x27/0x30
[<ffffffff81163295>] deactivate_locked_super+0x45/0x60
[<ffffffff81163cfa>] deactivate_super+0x4a/0x70
[<ffffffff8117d446>] mntput_no_expire+0x86/0xe0
[<ffffffff8117df7f>] sys_umount+0x6f/0x360
[<ffffffff8103f01b>] system_call_fastpath+0x16/0x1b
---[ end trace cce2a341ba3611a7 ]---
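The debugobj hooks themselves are small. Schematically (a sketch, not the
exact lib/percpu_counter.c code; the descriptor and helper names are
illustrative):
#include <linux/debugobjects.h>
#include <linux/percpu_counter.h>

static struct debug_obj_descr percpu_counter_debug_descr = {
	.name	= "percpu_counter",
};

/* called from percpu_counter_init() */
static void debug_percpu_counter_activate(struct percpu_counter *fbc)
{
	debug_object_init(fbc, &percpu_counter_debug_descr);
	debug_object_activate(fbc, &percpu_counter_debug_descr);
}

/* called from percpu_counter_destroy() */
static void debug_percpu_counter_deactivate(struct percpu_counter *fbc)
{
	debug_object_deactivate(fbc, &percpu_counter_debug_descr);
	debug_object_destroy(fbc, &percpu_counter_debug_descr);
}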
Signed-off-by: Tejun Heo <tj@kernel.org>
Acked-by: Thomas Gleixner <tglx@linutronix.de>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Despite the idr_pre_get() kernel-doc, there are some cases where you can
call idr_pre_get() from within locked regions. Add a description for such
cases.
See also: http://lkml.org/lkml/2010/9/16/462
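For example, one such case is preloading with a non-blocking mask while a
spinlock is held (a sketch with placeholder names, not the exact wording
added to the kernel-doc):
#include <linux/idr.h>
#include <linux/spinlock.h>
#include <linux/gfp.h>

static DEFINE_IDR(my_idr);
static DEFINE_SPINLOCK(my_lock);

static int get_id_locked(void *ptr, int *id)
{
	int ret;

	spin_lock(&my_lock);
	/* GFP_ATOMIC cannot sleep, so calling idr_pre_get() under the
	 * lock is safe here; it may still fail under memory pressure. */
	if (!idr_pre_get(&my_idr, GFP_ATOMIC)) {
		spin_unlock(&my_lock);
		return -ENOMEM;
	}
	ret = idr_get_new(&my_idr, ptr, id);
	spin_unlock(&my_lock);
	return ret;
}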
[akpm@linux-foundation.org: cleanups, grammatical fixes]
Signed-off-by: Naohiro Aota <naota@elisp.net>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
scnprintf() should return 0 if @size == 0. Update its comment as well,
since @size is unsigned.
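For illustration (a sketch): scnprintf() returns the number of characters
actually written into the buffer, so with @size == 0 nothing can be written
and 0 is the only consistent return value.
#include <linux/kernel.h>

static void scnprintf_demo(void)
{
	char buf[16];
	int n;

	n = scnprintf(buf, 0, "hello");			  /* @size == 0: returns 0 */
	n += scnprintf(buf, sizeof(buf), "hello %d", 42); /* returns chars written */
	pr_info("wrote %d characters\n", n);
}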
Signed-off-by: Changli Gao <xiaosuo@gmail.com>
Cc: Ingo Molnar <mingo@elte.hu>
Cc: Joe Perches <joe@perches.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
* git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/driver-core-2.6: (31 commits)
driver core: Display error codes when class suspend fails
Driver core: Add section count to memory_block struct
Driver core: Add mutex for adding/removing memory blocks
Driver core: Move find_memory_block routine
hpilo: Despecificate driver from iLO generation
driver core: Convert link_mem_sections to use find_memory_block_hinted.
driver core: Introduce find_memory_block_hinted which utilizes kset_find_obj_hinted.
kobject: Introduce kset_find_obj_hinted.
driver core: fix build for CONFIG_BLOCK not enabled
driver-core: base: change to new flag variable
sysfs: only access bin file vm_ops with the active lock
sysfs: Fail bin file mmap if vma close is implemented.
FW_LOADER: fix kconfig dependency warning on HOTPLUG
uio: Statically allocate uio_class and use class .dev_attrs.
uio: Support 2^MINOR_BITS minors
uio: Cleanup irq handling.
uio: Don't clear driver data
uio: Fix lack of locking in init_uio_class
SYSFS: Allow boot time switching between deprecated and modern sysfs layout
driver core: remove CONFIG_SYSFS_DEPRECATED_V2 but keep it for block devices
...
* 'llseek' of git://git.kernel.org/pub/scm/linux/kernel/git/arnd/bkl:
vfs: make no_llseek the default
vfs: don't use BKL in default_llseek
llseek: automatically add .llseek fop
libfs: use generic_file_llseek for simple_attr
mac80211: disallow seeks in minstrel debug code
lirc: make chardev nonseekable
viotape: use noop_llseek
raw: use explicit llseek file operations
ibmasmfs: use generic_file_llseek
spufs: use llseek in all file operations
arm/omap: use generic_file_llseek in iommu_debug
lkdtm: use generic_file_llseek in debugfs
net/wireless: use generic_file_llseek in debugfs
drm: use noop_llseek
* 'config' of git://git.kernel.org/pub/scm/linux/kernel/git/arnd/bkl:
BKL: introduce CONFIG_BKL.
dabusb: remove the BKL
sunrpc: remove the big kernel lock
init/main.c: remove BKL notations
blktrace: remove the big kernel lock
rtmutex-tester: make it build without BKL
dvb-core: kill the big kernel lock
dvb/bt8xx: kill the big kernel lock
tlclk: remove big kernel lock
fix rawctl compat ioctls breakage on amd64 and itanic
uml: kill big kernel lock
parisc: remove big kernel lock
cris: autoconvert trivial BKL users
alpha: kill big kernel lock
isapnp: BKL removal
s390/block: kill the big kernel lock
hpet: kill BKL, add compat_ioctl
One call chain getting to kset_find_obj is:
link_mem_sections()
find_mem_section()
kset_find_obj()
This is done during boot. The memory sections were added in a linearly
increasing order and link_mem_sections tends to utilize them in that
same linear order.
Introduce kset_find_obj_hinted(), which is passed the result of the
previous kset_find_obj() call and uses it for a quick "is the next object
our desired object?" check before falling back to the old behavior.
Signed-off-by: Robin Holt <holt@sgi.com>
To: Robert P. J. Day <rpjday@crashcourse.ca>
Reviewed-by: KAMEZAWA Hiroyuki <kamezawa.hiroyu@jp.fujitsu.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
With the ddebug_query= boot parameter it makes sense to set up
dynamic debug as soon as possible.
I expect sysfs files cannot be set up via an arch_initcall, because
that runs even before fs_initcall. Therefore I split the
dynamic_debug_init function into an early one and a later one that
provides the /sys/../dynamic_debug/control file.
Possibly dynamic_debug can be initialized even earlier; I am not sure
whether that would still make sense. I picked arch_initcall because it
already covers quite a lot.
Dynamic debug needs to allocate memory, so it is not easily possible to
set it up before the command line gets parsed.
Therefore the boot parameter's query string is stored in a temporary
string which is applied when dynamic debug gets set up.
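The mechanism is roughly the following (a sketch; the buffer size and names
are illustrative):
#include <linux/init.h>
#include <linux/kernel.h>
#include <linux/string.h>

#define DDEBUG_STRING_SIZE 1024
static char ddebug_setup_string[DDEBUG_STRING_SIZE];

/* Runs while the command line is parsed, long before any initcall:
 * just stash the query away, memory allocation is not possible yet. */
static int __init ddebug_query_setup(char *str)
{
	if (strlcpy(ddebug_setup_string, str, DDEBUG_STRING_SIZE) >=
	    DDEBUG_STRING_SIZE) {
		pr_warning("ddebug_query= string too large\n");
		return 0;
	}
	return 1;
}
__setup("ddebug_query=", ddebug_query_setup);
The early (arch_initcall) part of the init code then applies the stored
string once the dynamic debug tables exist.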
This has been tested with ddebug_query="file ec.c +p"
and I could retrieve pr_debug() messages early at boot during ACPI setup:
ACPI: EC: Look up EC in DSDT
ACPI: EC: ---> status = 0x08
ACPI: EC: transaction start
ACPI: EC: <--- command = 0x80
ACPI: EC: ~~~> interrupt
ACPI: EC: ---> status = 0x08
ACPI: EC: <--- data = 0xa4
...
ACPI: Interpreter enabled
ACPI: (supports S0 S3 S4 S5)
ACPI: Using IOAPIC for interrupt routing
ACPI: EC: ---> status = 0x00
ACPI: EC: transaction start
ACPI: EC: <--- command = 0x80
Signed-off-by: Thomas Renninger <trenn@suse.de>
Acked-by: jbaron@redhat.com
Acked-by: Pekka Enberg <penberg@cs.helsinki.fi>
CC: linux-acpi@vger.kernel.org
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
Dynamic debug lacks the ability to enable debug messages at boot time.
One could patch initramfs or service startup scripts to write to
/sys/../dynamic_debug/control, but this sucks.
This patch makes it possible to pass a query, in the same format one can
write to /sys/../dynamic_debug/control, via a boot parameter.
When dynamic debug gets initialized, this query is automatically
applied.
Signed-off-by: Thomas Renninger <trenn@suse.de>
Acked-by: jbaron@redhat.com
Acked-by: Pekka Enberg <penberg@cs.helsinki.fi>
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
The parsing and applying of dynamic debug query strings is not only useful
for /sys/../dynamic_debug/control write access, but can also be used for
boot parameter parsing.
The boot parameter is introduced in a follow-up patch.
Signed-off-by: Thomas Renninger <trenn@suse.de>
Acked-by: jbaron@redhat.com
Acked-by: Pekka Enberg <penberg@cs.helsinki.fi>
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
* 'stable/swiotlb-0.9' of git://git.kernel.org/pub/scm/linux/kernel/git/konrad/swiotlb-2.6:
swiotlb: Use page alignment for early buffer allocation
swiotlb: make io_tlb_overflow static
With all the patches we have queued in the BKL removal tree, only a
few dozen modules are left that actually rely on the BKL, and even
among those there is a lot of low-hanging fruit. We need to decide what
to do about them; this patch illustrates one of the options:
Every user of the BKL is marked as 'depends on BKL' in Kconfig,
and CONFIG_BKL becomes a user-visible option. If it gets
disabled, no BKL-using module can be built any more and the BKL
code itself is compiled out.
The one exception is file locking, which is practically always
enabled and does a 'select BKL' instead. This effectively forces
CONFIG_BKL to be enabled until we have solved the fs/lockd
mess and can apply the patch that removes the BKL from fs/locks.c.
Signed-off-by: Arnd Bergmann <arnd@arndb.de>
While there are still only a few users around, rename things to make
them more consistent.
Signed-off-by: Peter Zijlstra <a.p.zijlstra@chello.nl>
LKML-Reference: <20101014203625.448565169@chello.nl>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
All file_operations should get a .llseek operation so we can make
nonseekable_open the default for future file operations without a
.llseek pointer.
The three cases that we can automatically detect are no_llseek, seq_lseek
and default_llseek. For cases where we can automatically prove that
the file offset is always ignored, we use noop_llseek, which maintains
the current behavior of not returning an error from a seek.
New drivers should normally not use noop_llseek but instead use no_llseek
and call nonseekable_open at open time. Existing drivers can be converted
to do the same when the maintainer knows for certain that no user code
relies on calling seek on the device file.
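As an example of the resulting pattern, a driver that already calls
nonseekable_open() ends up with an explicit no_llseek (hypothetical
example_* names):
#include <linux/fs.h>
#include <linux/module.h>

static int example_open(struct inode *inode, struct file *file)
{
	return nonseekable_open(inode, file);	/* offsets are meaningless here */
}

static ssize_t example_read(struct file *file, char __user *buf,
			    size_t count, loff_t *ppos)
{
	return 0;				/* never touches f_pos */
}

static const struct file_operations example_fops = {
	.owner	= THIS_MODULE,
	.open	= example_open,
	.read	= example_read,
	.llseek	= no_llseek,	/* made explicit instead of relying on a default */
};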
The generated code is often incorrectly indented and right now contains
comments that clarify for each added line why a specific variant was
chosen. In the version that gets submitted upstream, the comments will
be gone and I will manually fix the indentation, because there does not
seem to be a way to do that using coccinelle.
Some amount of new code is currently sitting in linux-next that should get
the same modifications, which I will do at the end of the merge window.
Many thanks to Julia Lawall for helping me learn to write a semantic
patch that does all this.
===== begin semantic patch =====
// This adds an llseek= method to all file operations,
// as a preparation for making no_llseek the default.
//
// The rules are
// - use no_llseek explicitly if we do nonseekable_open
// - use seq_lseek for sequential files
// - use default_llseek if we know we access f_pos
// - use noop_llseek if we know we don't access f_pos,
// but we still want to allow users to call lseek
//
@ open1 exists @
identifier nested_open;
@@
nested_open(...)
{
<+...
nonseekable_open(...)
...+>
}
@ open exists@
identifier open_f;
identifier i, f;
identifier open1.nested_open;
@@
int open_f(struct inode *i, struct file *f)
{
<+...
(
nonseekable_open(...)
|
nested_open(...)
)
...+>
}
@ read disable optional_qualifier exists @
identifier read_f;
identifier f, p, s, off;
type ssize_t, size_t, loff_t;
expression E;
identifier func;
@@
ssize_t read_f(struct file *f, char *p, size_t s, loff_t *off)
{
<+...
(
*off = E
|
*off += E
|
func(..., off, ...)
|
E = *off
)
...+>
}
@ read_no_fpos disable optional_qualifier exists @
identifier read_f;
identifier f, p, s, off;
type ssize_t, size_t, loff_t;
@@
ssize_t read_f(struct file *f, char *p, size_t s, loff_t *off)
{
... when != off
}
@ write @
identifier write_f;
identifier f, p, s, off;
type ssize_t, size_t, loff_t;
expression E;
identifier func;
@@
ssize_t write_f(struct file *f, const char *p, size_t s, loff_t *off)
{
<+...
(
*off = E
|
*off += E
|
func(..., off, ...)
|
E = *off
)
...+>
}
@ write_no_fpos @
identifier write_f;
identifier f, p, s, off;
type ssize_t, size_t, loff_t;
@@
ssize_t write_f(struct file *f, const char *p, size_t s, loff_t *off)
{
... when != off
}
@ fops0 @
identifier fops;
@@
struct file_operations fops = {
...
};
@ has_llseek depends on fops0 @
identifier fops0.fops;
identifier llseek_f;
@@
struct file_operations fops = {
...
.llseek = llseek_f,
...
};
@ has_read depends on fops0 @
identifier fops0.fops;
identifier read_f;
@@
struct file_operations fops = {
...
.read = read_f,
...
};
@ has_write depends on fops0 @
identifier fops0.fops;
identifier write_f;
@@
struct file_operations fops = {
...
.write = write_f,
...
};
@ has_open depends on fops0 @
identifier fops0.fops;
identifier open_f;
@@
struct file_operations fops = {
...
.open = open_f,
...
};
// use no_llseek if we call nonseekable_open
////////////////////////////////////////////
@ nonseekable1 depends on !has_llseek && has_open @
identifier fops0.fops;
identifier nso ~= "nonseekable_open";
@@
struct file_operations fops = {
... .open = nso, ...
+.llseek = no_llseek, /* nonseekable */
};
@ nonseekable2 depends on !has_llseek @
identifier fops0.fops;
identifier open.open_f;
@@
struct file_operations fops = {
... .open = open_f, ...
+.llseek = no_llseek, /* open uses nonseekable */
};
// use seq_lseek for sequential files
/////////////////////////////////////
@ seq depends on !has_llseek @
identifier fops0.fops;
identifier sr ~= "seq_read";
@@
struct file_operations fops = {
... .read = sr, ...
+.llseek = seq_lseek, /* we have seq_read */
};
// use default_llseek if there is a readdir
///////////////////////////////////////////
@ fops1 depends on !has_llseek && !nonseekable1 && !nonseekable2 && !seq @
identifier fops0.fops;
identifier readdir_e;
@@
// any other fop is used that changes pos
struct file_operations fops = {
... .readdir = readdir_e, ...
+.llseek = default_llseek, /* readdir is present */
};
// use default_llseek if at least one of read/write touches f_pos
/////////////////////////////////////////////////////////////////
@ fops2 depends on !fops1 && !has_llseek && !nonseekable1 && !nonseekable2 && !seq @
identifier fops0.fops;
identifier read.read_f;
@@
// read fops use offset
struct file_operations fops = {
... .read = read_f, ...
+.llseek = default_llseek, /* read accesses f_pos */
};
@ fops3 depends on !fops1 && !fops2 && !has_llseek && !nonseekable1 && !nonseekable2 && !seq @
identifier fops0.fops;
identifier write.write_f;
@@
// write fops use offset
struct file_operations fops = {
... .write = write_f, ...
+ .llseek = default_llseek, /* write accesses f_pos */
};
// Use noop_llseek if neither read nor write accesses f_pos
///////////////////////////////////////////////////////////
@ fops4 depends on !fops1 && !fops2 && !fops3 && !has_llseek && !nonseekable1 && !nonseekable2 && !seq @
identifier fops0.fops;
identifier read_no_fpos.read_f;
identifier write_no_fpos.write_f;
@@
// write fops use offset
struct file_operations fops = {
...
.write = write_f,
.read = read_f,
...
+.llseek = noop_llseek, /* read and write both use no f_pos */
};
@ depends on has_write && !has_read && !fops1 && !fops2 && !has_llseek && !nonseekable1 && !nonseekable2 && !seq @
identifier fops0.fops;
identifier write_no_fpos.write_f;
@@
struct file_operations fops = {
... .write = write_f, ...
+.llseek = noop_llseek, /* write uses no f_pos */
};
@ depends on has_read && !has_write && !fops1 && !fops2 && !has_llseek && !nonseekable1 && !nonseekable2 && !seq @
identifier fops0.fops;
identifier read_no_fpos.read_f;
@@
struct file_operations fops = {
... .read = read_f, ...
+.llseek = noop_llseek, /* read uses no f_pos */
};
@ depends on !has_read && !has_write && !fops1 && !fops2 && !has_llseek && !nonseekable1 && !nonseekable2 && !seq @
identifier fops0.fops;
@@
struct file_operations fops = {
...
+.llseek = noop_llseek, /* no read or write fn */
};
===== End semantic patch =====
Signed-off-by: Arnd Bergmann <arnd@arndb.de>
Cc: Julia Lawall <julia@diku.dk>
Cc: Christoph Hellwig <hch@infradead.org>
We could call free_bootmem_late() if swiotlb is not used, and the
reservation will then shrink to page alignment.
So allocate these buffers with page alignment in the first place, to avoid
losing two pages.
Before the patch:
[ 0.000000] memblock_x86_reserve_range: [00d3600000, 00d7600000] swiotlb buffer
[ 0.000000] memblock_x86_reserve_range: [00d7e7ef40, 00d7e9ef40] swiotlb list
[ 0.000000] memblock_x86_reserve_range: [00d7e3ef40, 00d7e7ef40] swiotlb orig_ad
[ 0.000000] memblock_x86_reserve_range: [000008a000, 0000092000] swiotlb overflo
After the patch we get:
[ 0.000000] memblock_x86_reserve_range: [00d3600000, 00d7600000] swiotlb buffer
[ 0.000000] memblock_x86_reserve_range: [00d7e7e000, 00d7e9e000] swiotlb list
[ 0.000000] memblock_x86_reserve_range: [00d7e3e000, 00d7e7e000] swiotlb orig_ad
[ 0.000000] memblock_x86_reserve_range: [000008a000, 0000092000] swiotlb overflo
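Sketch of the allocation change for one of these buffers (the names mirror
lib/swiotlb.c, but treat this as illustrative rather than the exact patch):
#include <linux/init.h>
#include <linux/bootmem.h>
#include <linux/mm.h>

static void * __init swiotlb_alloc_list_sketch(unsigned long nslabs)
{
	unsigned long bytes = nslabs * sizeof(int);

	/*
	 * Page-aligned size and start address: if swiotlb turns out to be
	 * unused, free_bootmem_late() can hand back every page of this
	 * region instead of leaving partially used head/tail pages behind.
	 */
	return alloc_bootmem_pages(PAGE_ALIGN(bytes));
}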
Signed-off-by: Yinghai Lu <yinghai@kernel.org>
Acked-by: FUJITA Tomonori <fujita.tomonori@lab.ntt.co.jp>
Cc: Becky Bruce <beckyb@kernel.crashing.org>
Signed-off-by: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>
We don't need to export io_tlb_overflow_buffer. I'll remove
io_tlb_overflow_buffer completely in the long term though.
Signed-off-by: FUJITA Tomonori <fujita.tomonori@lab.ntt.co.jp>
Signed-off-by: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>
Currently, disabling CONFIG_SLUB_DEBUG also disables SYSFS support, meaning
that the slabs cannot be tuned without DEBUG.
Make SYSFS support independent of CONFIG_SLUB_DEBUG.
Signed-off-by: Christoph Lameter <cl@linux.com>
Signed-off-by: Pekka Enberg <penberg@kernel.org>
With all the recent module loading cleanups, we've minimized the code
that sits under module_mutex, fixing various deadlocks and making it
possible to do most of the module loading in parallel.
However, that whole conversion totally missed the rather obscure code
that adds a new module to the list for BUG() handling. That code was
doubly obscure because (a) the code itself lives in lib/bug.c (for
dubious reasons) and (b) it gets called from the architecture-specific
"module_finalize()" rather than from generic code.
Calling it from arch-specific code makes no sense whatsoever to begin
with, and is now actively wrong since that code isn't protected by the
module loading lock any more.
So this commit moves the "module_bug_{finalize,cleanup}()" calls away
from the arch-specific code and into the generic code - and in the
process protects them with the module_mutex so that the list operations
are now safe.
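A sketch of the idea (parameter names are illustrative; the real call sits
in the generic module loading path in kernel/module.c):
#include <linux/module.h>
#include <linux/bug.h>
#include <linux/elf.h>
#include <linux/mutex.h>

static void add_module_bug_entries(struct module *mod, const Elf_Ehdr *hdr,
				   const Elf_Shdr *sechdrs)
{
	/* The BUG-table list manipulation now happens under the same
	 * mutex that protects the module list, instead of being called
	 * lockless from the arch-specific module_finalize(). */
	mutex_lock(&module_mutex);
	module_bug_finalize(hdr, sechdrs, mod);
	mutex_unlock(&module_mutex);
}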
Future fixups:
- move the module list handling code into kernel/module.c where it
belongs.
- get rid of 'module_bug_list' and just use the regular list of modules
(called 'modules' - imagine that) that we already create and maintain
for other reasons.
Reported-and-tested-by: Thomas Gleixner <tglx@linutronix.de>
Cc: Rusty Russell <rusty@rustcorp.com.au>
Cc: Adrian Bunk <bunk@kernel.org>
Cc: Andrew Morton <akpm@linux-foundation.org>
Cc: stable@kernel.org
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
If the original list is a power-of-two (POT) in length, the first callback
from line 73 will pass a==b, both pointing to the original list_head. This
is dangerous because the 'list_sort()' user may use 'container_of()' and
access the "containing" object, which does not necessarily exist for the
list head. So the user can access memory which does not belong to it. If
this is a write access, we can end up with memory corruption.
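To illustrate the hazard (hypothetical element type and compare callback):
#include <linux/kernel.h>
#include <linux/list.h>

struct item {
	int key;
	struct list_head list;
};

/*
 * If cmp() is ever called with a == b == &head (the bare list_head that
 * anchors the list), container_of() yields a pointer into whatever memory
 * happens to surround 'head' - there is no 'struct item' there at all.
 * Reading ia->key is already an out-of-bounds access, and a callback that
 * writes through such a pointer corrupts memory.
 */
static int cmp(void *priv, struct list_head *a, struct list_head *b)
{
	struct item *ia = container_of(a, struct item, list);
	struct item *ib = container_of(b, struct item, list);

	return ia->key - ib->key;
}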
Signed-off-by: Don Mullis <don.mullis@gmail.com>
Tested-by: Artem Bityutskiy <Artem.Bityutskiy@nokia.com>
Signed-off-by: Artem Bityutskiy <Artem.Bityutskiy@nokia.com>
Cc: <stable@kernel.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
The PROVE_RCU_REPEATEDLY option has no "Say Y"/"Say N" advice, so this commit
adds it.
Reported-by: Johannes Berg <johannes@sipsolutions.net>
Signed-off-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
Convert the 'dynamic debug' infrastructure to use jump labels.
Signed-off-by: Jason Baron <jbaron@redhat.com>
LKML-Reference: <b77627358cea3e27d7be4386f45f66219afb8452.1284733808.git.jbaron@redhat.com>
Signed-off-by: Steven Rostedt <rostedt@goodmis.org>
* 'for-linus' of git://git.kernel.dk/linux-2.6-block:
block: Range check cpu in blk_cpu_to_group
scatterlist: prevent invalid free when alloc fails
writeback: Fix lost wake-up shutting down writeback thread
writeback: do not lose wakeup events when forking bdi threads
cciss: fix reporting of max queue depth since init
block: switch s390 tape_block and mg_disk to elevator_change()
block: add function call to switch the IO scheduler from a driver
fs/bio-integrity.c: return -ENOMEM on kmalloc failure
bio-integrity.c: remove dependency on __GFP_NOFAIL
BLOCK: fix bio.bi_rw handling
block: put dev->kobj in blk_register_queue fail path
cciss: handle allocation failure
cfq-iosched: Documentation help for new tunables
cfq-iosched: blktrace print per slice sector stats
cfq-iosched: Implement tunable group_idle
cfq-iosched: Do group share accounting in IOPS when slice_idle=0
cfq-iosched: Do not idle if slice_idle=0
cciss: disable doorbell reset on reset_devices
blkio: Fix return code for mkdir calls
When CONFIG_IRQSOFF_TRACER is set and CONFIG_PROVE_LOCKING is not, we
get the following error:
$ make oldconfig
scripts/kconfig/conf --oldconfig arch/x86/Kconfig
warning: (IRQSOFF_TRACER && TRACING_SUPPORT && FTRACE && TRACE_IRQFLAGS_SUPPORT && !ARCH_USES_GETTIMEOFFSET) selects TRACE_IRQFLAGS which has unmet direct dependencies (DEBUG_KERNEL && TRACE_IRQFLAGS_SUPPORT && PROVE_LOCKING)
warning: (IRQSOFF_TRACER && TRACING_SUPPORT && FTRACE && TRACE_IRQFLAGS_SUPPORT && !ARCH_USES_GETTIMEOFFSET) selects TRACE_IRQFLAGS which has unmet direct dependencies (DEBUG_KERNEL && TRACE_IRQFLAGS_SUPPORT && PROVE_LOCKING)
This is because IRQSOFF_TRACER selects TRACE_IRQFLAGS but TRACE_IRQFLAGS
has PROVE_LOCKING as a dependency. That combination is incorrect, so
this patch changes TRACE_IRQFLAGS to be just a simple bool that
neither depends on nor selects anything. Instead, both IRQSOFF_TRACER and
PROVE_LOCKING select it.
Reported-by: Richard Kennedy <richard@rsk.demon.co.uk>
Signed-off-by: Steven Rostedt <rostedt@goodmis.org>
It was unclear from the original kernel-doc how nextidp works in
idr_get_next(). Let's describe it.
Signed-off-by: Naohiro Aota <naota@elisp.net>
Acked-by: Tejun Heo <tj@kernel.org>
Signed-off-by: Jiri Kosina <jkosina@suse.cz>
Fix the following kernel-doc warnings.
% perl scripts/kernel-doc lib/idr.c > /dev/null
Warning(lib/idr.c:300): No description found for parameter 'starting_id'
Warning(lib/idr.c:300): Excess function parameter 'start_id' description in 'idr_get_new_above'
Warning(lib/idr.c:485): No description found for parameter 'idp'
Warning(lib/idr.c:596): No description found for parameter 'nextidp'
Warning(lib/idr.c:596): Excess function parameter 'id' description in 'idr_get_next'
Warning(lib/idr.c:774): No description found for parameter 'starting_id'
Warning(lib/idr.c:774): Excess function parameter 'staring_id' description in 'ida_get_new_above'
Warning(lib/idr.c:918): No description found for parameter 'ida'
Signed-off-by: Naohiro Aota <naota@elisp.net>
Acked-by: Tejun Heo <tj@kernel.org>
Signed-off-by: Jiri Kosina <jkosina@suse.cz>
When the allocation fails, free_table is called. Depending on the number of
bytes requested, we determine whether we are going to call _get_free_page()
or kmalloc(). When the allocation fails, our math is wrong (due to
sg_size - 1), and the last buffer is wrongly assumed to have been allocated
by kmalloc. Hence, kfree gets called on it and a panic occurs.
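The allocation helpers look roughly like this (a sketch modeled on
lib/scatterlist.c; the function names here are illustrative), which is why
the free path must see exactly the same element count the allocation saw:
#include <linux/scatterlist.h>
#include <linux/slab.h>
#include <linux/gfp.h>

static struct scatterlist *sg_alloc_chunk(unsigned int nents, gfp_t gfp_mask)
{
	if (nents == SG_MAX_SINGLE_ALLOC)	/* whole-page chunk */
		return (struct scatterlist *)__get_free_page(gfp_mask);
	return kmalloc(nents * sizeof(struct scatterlist), gfp_mask);
}

static void sg_free_chunk(struct scatterlist *sg, unsigned int nents)
{
	/* If the caller miscounts nents (the sg_size - 1 problem), a
	 * page-backed chunk takes the kfree() branch and the kernel panics. */
	if (nents == SG_MAX_SINGLE_ALLOC)
		free_page((unsigned long)sg);
	else
		kfree(sg);
}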
Signed-off-by: Jeffrey Carlyle <jeff.carlyle@motorola.com>
Signed-off-by: Olusanya Soyannwo <c23746@motorola.com>
Acked-by: Tejun Heo <tj@kernel.org>
Signed-off-by: Jens Axboe <jaxboe@fusionio.com>
* 'radix-tree' of git://git.kernel.org/pub/scm/linux/kernel/git/dgc/xfsdev:
radix-tree: radix_tree_range_tag_if_tagged() can set incorrect tags
radix-tree: clear all tags in radix_tree_node_rcu_free
Commit ebf8aa44be ("radix-tree:
implement function radix_tree_range_tag_if_tagged") does not safely
set tags on intermediate tree nodes. The code walks down the tree
setting tags before it has fully resolved the path to the leaf, under
the assumption that there will be a leaf slot with the tag set in the
range it is searching.
Unfortunately, this is not a valid assumption - we can abort after
setting a tag on an intermediate node if we overrun the number of
tags we are allowed to set in a batch, or stop scanning because we
have passed the last scan index before we reach a leaf slot with
the tag we are searching for set.
As a result, we can leave the function with tags set on intermediate
nodes which can be tripped over later by tag-based lookups. The
result of these stale tags is that a lookup may end prematurely or
livelock because it cannot make progress.
The fix for the problem involves recording the traversal path we
take to the leaf nodes, and only propagating the tags back up the
tree once the tag is set in the leaf node slot. We are already
recording the path for efficient traversal, so there is no
additional overhead to doing the intermediate node tag setting in
this manner.
This fixes a radix tree lookup livelock triggered by the new
writeback sync livelock avoidance code introduced in commit
f446daaea9 ("mm: implement writeback
livelock avoidance using page tagging").
Signed-off-by: Dave Chinner <dchinner@redhat.com>
Acked-by: Jan Kara <jack@suse.cz>
Commit f446daaea9 ("mm: implement
writeback livelock avoidance using page tagging") introduced a new
radix tree tag, increasing the number of tags in each node from 2 to
3. It did not, however, fix up the code in
radix_tree_node_rcu_free() that cleans up after radix_tree_shrink()
and hence could leave stray tags set in the new tag array.
The result is that the livelock avoidance code added in the
above commit would hit stale tags when doing tag-based lookups,
resulting in livelocks when trying to traverse the tree.
Fix this problem in radix_tree_node_rcu_free() so it doesn't happen
again in the future by using a loop to walk all the tags up to
RADIX_TREE_MAX_TAGS to clear the stray tags radix_tree_shrink()
leaves behind.
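In code form, the loop is of this shape (a sketch; struct radix_tree_node and
the tag helpers are private to lib/radix-tree.c, so __clear_bit stands in for
the internal tag_clear() helper):
#include <linux/radix-tree.h>
#include <linux/bitops.h>

static void node_clear_stale_tags(struct radix_tree_node *node)
{
	unsigned int i;

	/* Clear slot 0's tag bit for every tag, not just tags 0 and 1,
	 * so the extra tag added for writeback tagging is covered too. */
	for (i = 0; i < RADIX_TREE_MAX_TAGS; i++)
		__clear_bit(0, node->tags[i]);
}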
Signed-off-by: Dave Chinner <dchinner@redhat.com>
Acked-by: Nick Piggin <npiggin@kernel.dk>
Acked-by: Jan Kara <jack@suse.cz>
When radix_tree_maxindex() is ~0UL, it can happen that scanning overflows
the index, and the tree traversal code then goes astray, reading memory
until it hits unreadable memory. Check for overflow and exit in that case.
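The check is of this shape (a sketch): after advancing the scan index, a
wrap back to zero means the previous slot was at ~0UL and the walk must
terminate.
#include <linux/types.h>

/* Sketch of the overflow guard added to the scan loop: stop the
 * traversal instead of wrapping around and re-reading from index 0. */
static inline bool advance_scan_index(unsigned long *index)
{
	unsigned long next = *index + 1;

	if (next == 0)		/* wrapped past ~0UL */
		return false;
	*index = next;
	return true;
}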
Signed-off-by: Jan Kara <jack@suse.cz>
Cc: Christoph Hellwig <hch@lst.de>
Cc: Nick Piggin <nickpiggin@yahoo.com.au>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Currently, if RCU CPU stall warnings are enabled, they are enabled
immediately upon boot. They can be manually disabled via /sys (and
also re-enabled via /sys), and are automatically disabled upon panic.
However, some users need RCU CPU stalls to be disabled at boot time,
but to be enabled without rebuilding/rebooting. For example, someone
running a real-time application in production might not want the
additional latency of RCU CPU stall detection in normal operation, but
might need to enable it at any point for fault isolation purposes.
This commit therefore provides a new CONFIG_RCU_CPU_STALL_DETECTOR_RUNNABLE
kernel configuration parameter that maintains the current behavior
(enabled at boot) by default, but allows a kernel to be configured
with RCU CPU stall detection built in, yet disabled at boot time.
Requested-by: Clark Williams <williams@redhat.com>
Requested-by: John Kacur <jkacur@redhat.com>
Signed-off-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com>