If there are a sync and an async queue and the sync queue's think time
is small, we can ignore the sync queue's dispatch quantum. Because the
sync queue will always preempt the async queue, we don't need to care
about async's latency. This can fix a performance regression of
aiostress test, which is introduced by commit f8ae6e3eb8. The issue
should exist even without the commit, but the commit amplifies the
impact.
The initial post does the same optimization for RT queue too, but since
I have no real workload for it, Vivek suggests to drop it.
Signed-off-by: Shaohua Li <shaohua.li@intel.com>
Reviewed-by: Gui Jianfeng <guijianfeng@cn.fujitsu.com>
Signed-off-by: Jens Axboe <jaxboe@fusionio.com>
Move blk_throtl_exit() in blk_cleanup_queue() as blk_throtl_exit() is
written in such a way that it needs queue lock. In blk_release_queue()
there is no gurantee that ->queue_lock is still around.
Initially blk_throtl_exit() was in blk_cleanup_queue() but Ingo reported
one problem.
https://lkml.org/lkml/2010/10/23/86
And a quick fix moved blk_throtl_exit() to blk_release_queue().
commit 7ad58c0286
Author: Jens Axboe <jaxboe@fusionio.com>
Date: Sat Oct 23 20:40:26 2010 +0200
block: fix use-after-free bug in blk throttle code
This patch reverts above change and does not try to shutdown the
throtl work in blk_sync_queue(). By avoiding call to
throtl_shutdown_timer_wq() from blk_sync_queue(), we should also avoid
the problem reported by Ingo.
blk_sync_queue() seems to be used only by md driver and it seems to be
using it to make sure q->unplug_fn is not called as md registers its
own unplug functions and it is about to free up the data structures
used by unplug_fn(). Block throttle does not call back into unplug_fn()
or into md. So there is no need to cancel blk throttle work.
In fact I think cancelling block throttle work is bad because it might
happen that some bios are throttled and scheduled to be dispatched later
with the help of pending work and if work is cancelled, these bios might
never be dispatched.
Block layer also uses blk_sync_queue() during blk_cleanup_queue() and
blk_release_queue() time. That should be safe as we are also calling
blk_throtl_exit() which should make sure all the throttling related
data structures are cleaned up.
Signed-off-by: Vivek Goyal <vgoyal@redhat.com>
Signed-off-by: Jens Axboe <jaxboe@fusionio.com>
Now we initialize ->queue_lock at queue allocation time so driver does
not have to worry about initializing it before calling
blk_cleanup_queue().
Signed-off-by: Jens Axboe <jaxboe@fusionio.com>
There does not seem to be a clear convention whether q->queue_lock is
initialized or not when blk_cleanup_queue() is called. In the past it
was not necessary but now blk_throtl_exit() takes up queue lock by
default and needs queue lock to be available.
In fact elevator_exit() code also has similar requirement just that it
is less stringent in the sense that elevator_exit() is called only if
elevator is initialized.
Two problems have been noticed because of ambiguity about spin lock
status.
- If a driver calls blk_alloc_queue() and then soon calls
blk_cleanup_queue() almost immediately, (because some other
driver structure allocation failed or some other error happened)
then blk_throtl_exit() will run into issues as queue lock is not
initialized. Loop driver ran into this issue recently and I
noticed error paths in md driver too. Similar error paths should
exist in other drivers too.
- If some driver provided external spin lock and zapped the lock
before blk_cleanup_queue(), then it can lead to issues.
So this patch initializes the default queue lock at queue allocation time.
block throttling code is one of the users of queue lock and it is
initialized at the queue allocation time, so it makes sense to
initialize ->queue_lock also to internal lock. A driver can overide that
lock later. This will take care of the issue where a driver does not have
to worry about initializing the queue lock to default before calling
blk_cleanup_queue()
Signed-off-by: Vivek Goyal <vgoyal@redhat.com>
Signed-off-by: Jens Axboe <jaxboe@fusionio.com>
Rename the numerals in the diskstats_show() into the macros.
Cc: Jens Axboe <axboe@kernel.dk>
Signed-off-by: Liu Yuan <tailai.ly@taobao.com>
Signed-off-by: Jens Axboe <jaxboe@fusionio.com>
Effectively, make group_isolation=1 the default and remove the tunable.
The setting group_isolation=0 was because by default we idle on
sync-noidle tree and on fast devices, this can be very harmful for
throughput.
However, this problem can also be addressed by tuning slice_idle and
possibly group_idle on faster storage devices.
This change simplifies the CFQ code by removing the feature entirely.
Signed-off-by: Justin TerAvest <teravest@google.com>
Acked-by: Vivek Goyal <vgoyal@redhat.com>
Signed-off-by: Jens Axboe <jaxboe@fusionio.com>
* 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/ecryptfs/ecryptfs-2.6:
eCryptfs: Copy up lower inode attrs in getattr
ecryptfs: read on a directory should return EISDIR if not supported
eCryptfs: Handle NULL nameidata pointers
eCryptfs: Revert "dont call lookup_one_len to avoid NULL nameidata"
The current code does not follow Intel documentation: It misses some things
and does other, undocumented things. This causes wrong backlight values in
certain conditions. Instead of adding tricky code handling badly documented
and rare corner cases, don't handle combination mode specially at all. This
way PCI_LBPC is never touched and weird things shouldn't happen.
If combination mode is enabled, then the only downside is that changing the
brightness has a greater granularity (the LBPC value), but LBPC is at most
254 and the maximum is in the thousands, so this is no real functional loss.
A potential problem with not handling combined mode is that a brightness of
max * PCI_LBPC is not bright enough. However, this is very unlikely because
from the documentation LBPC seems to act as a scaling factor and doesn't look
like it's supposed to be changed after boot. The value at boot should always
result in a bright enough screen.
IMPORTANT: However, although usually the above is true, it may not be when
people ran an older (2.6.37) kernel which messed up the LBPC register, and
they are unlucky enough to have a BIOS that saves and restores the LBPC value.
Then a good kernel may seem to not work: Max brightness isn't bright enough.
If this happens people should boot back into the old kernel, set brightness
to the maximum, and then reboot. After that everything should be fine.
For more information see the below links. This fixes bugs:
http://bugzilla.kernel.org/show_bug.cgi?id=23472http://bugzilla.kernel.org/show_bug.cgi?id=25072
Signed-off-by: Indan Zupancic <indan@nul.nu>
Tested-by: Alex Riesen <raa.lkml@gmail.com>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
We force particular alignment when we generate attribute structures
when generation MODULE_VERSION() data and we need to make sure that
this alignment is followed when we iterate over these structures,
otherwise we may crash on platforms whose natural alignment is not
sizeof(void *), such as m68k.
Reported-by: Geert Uytterhoeven <geert@linux-m68k.org>
Signed-off-by: Dmitry Torokhov <dtor@vmware.com>
[ There are more issues here, but the fixes are incredibly ugly - Linus ]
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Update the "log_buf_len" description to use [KMG] syntax for the
buffer size.
Signed-off-by: Randy Dunlap <randy.dunlap@oracle.com>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
The '[KMG]' suffix is commonly described after a number of kernel
parameter values documentation. Explicitly state its semantics.
Signed-off-by: Ahmed S. Darwish <darwish.07@gmail.com>
Signed-off-by: Randy Dunlap <randy.dunlap@oracle.com>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
* 'fixes' of master.kernel.org:/home/rmk/linux-2.6-arm:
ARM: 6745/1: kprobes insn decoding fix
ARM: tlb: move noMMU tlb_flush() to asm/tlb.h
ARM: tlb: delay page freeing for SMP and ARMv7 CPUs
ARM: Keep exit text/data around for SMP_ON_UP
ARM: Ensure predictable endian state on signal handler entry
ARM: 6740/1: Place correctly notes section in the linker script
ARM: 6700/1: SPEAr: Correct SOC config base address for spear320
ARM: 6722/1: SPEAr: sp810: switch to slow mode before reset
ARM: 6712/1: SPEAr: replace readl(), writel() with relaxed versions in uncompress.h
ARM: 6720/1: SPEAr: Append UL to VMALLOC_END
ARM: 6676/1: Correct the cpu_architecture() function for ARMv7
ARM: 6739/1: update .gitignore for boot/compressed
ARM: 6743/1: errata: interrupted ICALLUIS may prevent completion of broadcasted operation
ARM: 6742/1: pmu: avoid setting IRQ affinity on UP systems
ARM: 6741/1: errata: pl310 cache sync operation may be faulty
It is found on Dell Inspiron 1018 that the firmware reports that the hardware
killswitch is not supported. This makes the rfkill key not functional.
This patch forces the driver to toggle the firmware rfkill status in the case
that the hardware killswitch is indicated as unsupported by the firmware.
Signed-off-by: Keng-Yu Lin <keng-yu.lin@canonical.com>
Tested-by: Alessio Igor Bogani <abogani@texware.it>
Signed-off-by: Matthew Garrett <mjg@redhat.com>
Some thinkpad hotkeys report key codes like KEY_FN_F8 when something
like KEY_VOLUMEDOWN is desired. Always provide the scan codes in
addition to the key codes to assist with debugging these issues. Also
send the scan code before the key code to match what other drivers do,
as some userspace utilities expect this ordering.
Signed-off-by: Seth Forshee <seth.forshee@canonical.com>
Signed-off-by: Matthew Garrett <mjg@redhat.com>
6AF4F258-B401-42fd-BE91-3D4AC2D7C0D3 needs to be
6AF4F258-B401-42FD-BE91-3D4AC2D7C0D3 to match the hardware alias.
Signed-off-by: Matthew Garrett <mjg@redhat.com>
Acked-by: Carlos Corbacho <carlos@strangeworlds.co.uk>
Cc: stable@kernel.org
Most platform/x86 drivers that use INPUT_SPARSEKMAP also depend on INPUT,
so do the same for ideapad-laptop. This fixes a kconfig warning and
subsequent build errors when CONFIG_INPUT is disabled.
warning: (ACER_WMI && ASUS_LAPTOP && DELL_WMI && HP_WMI && PANASONIC_LAPTOP && IDEAPAD_LAPTOP && EEEPC_LAPTOP && EEEPC_WMI && MSI_WMI && TOPSTAR_LAPTOP && ACPI_TOSHIBA) selects INPUT_SPARSEKMAP which has unmet direct dependencies (!S390 && INPUT)
ERROR: "input_free_device" [drivers/platform/x86/ideapad-laptop.ko] undefined!
ERROR: "input_register_device" [drivers/platform/x86/ideapad-laptop.ko] undefined!
ERROR: "sparse_keymap_setup" [drivers/platform/x86/ideapad-laptop.ko] undefined!
ERROR: "input_allocate_device" [drivers/platform/x86/ideapad-laptop.ko] undefined!
ERROR: "input_unregister_device" [drivers/platform/x86/ideapad-laptop.ko] undefined!
ERROR: "sparse_keymap_free" [drivers/platform/x86/ideapad-laptop.ko] undefined!
ERROR: "sparse_keymap_report_event" [drivers/platform/x86/ideapad-laptop.ko] undefined!
Signed-off-by: Randy Dunlap <randy.dunlap@oracle.com>
Cc: David Woodhouse <dwmw2@infradead.org>
Cc: Matthew Garrett <mjg@redhat.com>
Cc: platform-driver-x86@vger.kernel.org
Signed-off-by: Matthew Garrett <mjg@redhat.com>
Don't allow everybody to change ACPI settings. The comment says that it
is done deliberatelly, however, the comment before disp_proc_write()
says that at least one of these setting is experimental.
Signed-off-by: Vasiliy Kulikov <segoon@openwall.com>
Signed-off-by: Matthew Garrett <mjg@redhat.com>
There is no need to install a chained handler for this hardware. This
is a plain x86 IOAPIC interrupt which is handled by the core code
perfectly fine. There is nothing special about demultiplexing these
gpio interrupts which justifies a custom hack. Replace it by a plain
old interrupt handler installed with request_irq. That makes the code
agnostic about the underlying primary interrupt hardware. The overhead
for this is minimal, but it gives us the advantage of accounting,
balancing and to detect interrupt storms. gpio interrupts are not
really that performance critical.
Patch fixups from akpm
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Signed-off-by: Matthew Garrett <mjg@redhat.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
LANMAN response length was changed to 16 bytes instead of 24 bytes.
Revert it back to 24 bytes.
Signed-off-by: Shirish Pargaonkar <shirishpargaonkar@gmail.com>
CC: stable@kernel.org
Signed-off-by: Steve French <sfrench@us.ibm.com>
The lower filesystem may do some type of inode revalidation during a
getattr call. eCryptfs should take advantage of that by copying the
lower inode attributes to the eCryptfs inode after a call to
vfs_getattr() on the lower inode.
I originally wrote this fix while working on eCryptfs on nfsv3 support,
but discovered it also fixed an eCryptfs on ext4 nanosecond timestamp
bug that was reported.
https://bugs.launchpad.net/bugs/613873
Cc: <stable@kernel.org>
Signed-off-by: Tyler Hicks <tyhicks@linux.vnet.ibm.com>
read() calls against a file descriptor connected to a directory are
incorrectly returning EINVAL rather than EISDIR:
[EISDIR]
[XSI] [Option Start] The fildes argument refers to a directory and the
implementation does not allow the directory to be read using read()
or pread(). The readdir() function should be used instead. [Option End]
This occurs because we do not have a .read operation defined for
ecryptfs directories. Connect this up to generic_read_dir().
BugLink: http://bugs.launchpad.net/bugs/719691
Signed-off-by: Andy Whitcroft <apw@canonical.com>
Signed-off-by: Tyler Hicks <tyhicks@linux.vnet.ibm.com>
Marcin Slusarz says:
> In arch/arm/kernel/kprobes-decode.c there's a function
> arm_kprobe_decode_insn which does:
>
> } else if ((insn & 0x0e000000) == 0x0c400000) {
> ...
>
> This is always false, so code below is dead.
> I found this bug by coccinelle (http://coccinelle.lip6.fr/).
Reported-by: Marcin Slusarz <marcin.slusarz@gmail.com>
Signed-off-by: Nicolas Pitre <nicolas.pitre@linaro.org>
Signed-off-by: Russell King <rmk+kernel@arm.linux.org.uk>
There's no need to noMMU to put tlb_flush() in asm/tlbflush.h - it's
part of the tlb shootdown interface. Move it to asm/tlb.h instead, as
per x86.
Signed-off-by: Russell King <rmk+kernel@arm.linux.org.uk>
We need to delay freeing any mapped page on SMP and ARMv7 systems to
ensure that the data is not accessed by other CPUs, or is used for
speculative prefetch with ARMv7. This includes not only mapped pages
but also pages used for the page tables themselves.
This avoids races with the MMU/other CPUs accessing pages after they've
been freed but before we've invalidated the TLB.
Signed-off-by: Russell King <rmk+kernel@arm.linux.org.uk>
When SMP_ON_UP is used and the spinlocks are inlined, we end up with
inline spinlocks in the exit code, with references from the SMP
alternatives section to the exit sections. This causes link time
errors. Avoid this by placing the exit sections in the init-discarded
region.
Cc: <stable@kernel.org>
Tested-by: Dave Martin <dave.martin@linaro.org>
Signed-off-by: Russell King <rmk+kernel@arm.linux.org.uk>
Ensure a predictable endian state when entering signal handlers. This
avoids programs which use SETEND to momentarily switch their endian
state from having their signal handlers entered with an unpredictable
endian state.
Cc: <stable@kernel.org>
Acked-by: Dave Martin <dave.martin@linaro.org>
Signed-off-by: Russell King <rmk+kernel@arm.linux.org.uk>
Commit 18991197b4 added --build-id
linker option when toolchain supports it. ARM one does, but for some
reason places the section at 0 when linker script doesn't mention it
explicitly.
The 1e621a8e37 worked around the problem
removing this section from binary image with explicit objcopy options,
but it still exists in vmlinux, confusing tools like debuggers and perf.
This problem was discussed here:
http://lists.infradead.org/pipermail/linux-arm-kernel/2010-May/015994.htmlhttp://lists.infradead.org/pipermail/linux-arm-kernel/2010-May/015994.html
but the proposed changes to the linker script were substantial.
This patch simply places NOTES (36 bytes long, at least when compiled
with CodeSourcery toolchain) between data and bss, which seem to be
the right place (and suggested by the sample linker script in
include/asm-generic/vmlinux.lds.h).
It is enough to place it correctly in vmlinux (so debuggers are happy):
Section Headers:
[11] .data PROGBITS c07ce000 7ce000 020fc0 00 WA 0 0 32
[12] .notes NOTE c07eefc0 7eefc0 000024 00 AX 0 0 4
[13] .bss NOBITS c07ef000 7eefe4 01e628 00 WA 0 0 32
Program Headers:
LOAD 0x008000 0xc0008000 0xc0008000 0x7e6fe4 0x805628 RWE 0x8000
NOTE 0x7eefc0 0xc07eefc0 0xc07eefc0 0x00024 0x00024 R E 0x4
Section to Segment mapping:
Segment Sections...
00 <...> .data .notes .bss
01 .notes
and to get it exposed as /sys/kernel/notes used by perf tools.
Signed-off-by: Pawel Moll <pawel.moll@arm.com>
Signed-off-by: Russell King <rmk+kernel@arm.linux.org.uk>
SPEAR320_SOC_CONFIG_BASE was wrong, causing the wrong registers to be
accessed.
Reviewed-by: Stanley Miao <stanley.miao@windriver.com>
Signed-off-by: Viresh Kumar <viresh.kumar@st.com>
Signed-off-by: Russell King <rmk+kernel@arm.linux.org.uk>
In sysctl_soft_reset(), switch to slow mode before resetting the system
via the system controller. This is required.
Reviewed-by: Stanley Miao <stanley.miao@windriver.com>
Signed-off-by: Shiraz Hashim <shiraz.hashim@st.com>
Signed-off-by: Russell King <rmk+kernel@arm.linux.org.uk>
readl() and writel() calls the outer cache maintainance operations
which are not available during Linux uncompression. This patch replaces
readl() and writel() with readl_relaxed() and writel_relaxed() to avoid
the link time errors.
Reviewed-by: Stanley Miao <stanley.miao@windriver.com>
Signed-off-by: Viresh Kumar <viresh.kumar@st.com>
Signed-off-by: Russell King <rmk+kernel@arm.linux.org.uk>
This patch fixes following warning:
arch/arm/mm/init.c:606: warning: format '%08lx' expects type 'long unsigned int', but argument 12 has type 'unsigned int'
by appending UL to VMALLOC_END's Number.
Reviewed-by: Stanley Miao <stanley.miao@windriver.com>
Signed-off-by: Viresh Kumar <viresh.kumar@st.com>
Signed-off-by: Russell King <rmk+kernel@arm.linux.org.uk>
The dependency is already expressed by the Makefiles, storing it in the
.cmd file breaks build if a .c file is replaced by .S or vice versa,
because the .cmd file contains
foo/bar.o: foo/bar.c ...
foo/bar.c ... :
so the foo/bar.c -> foo/bar.o rule triggers even if there is no
foo/bar.c anymore.
Acked-by: Stephen Rothwell <sfr@canb.auug.org.au>
Signed-off-by: Michal Marek <mmarek@suse.cz>
* 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tiwai/sound-2.6:
ALSA: HDA: Do not announce false surround in Conexant auto
ALSA: HDA: Conexant auto: Handle multiple connections to ADC node
ALSA: HDA: Add position_fix quirk for an Asus device
ALSA: caiaq - Fix possible string-buffer overflow
ALSA: au88x0 - Modify pointer callback to give accurate playback position
* 'hwmon-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/groeck/staging:
hwmon: (lm85) extend to support EMC6D103 chips
MAINTAINERS: Remove stale hwmon quilt tree
hwmon: (k10temp) add support for AMD Family 12h/14h CPUs
hwmon: (jc42) do not allow writing to locked registers
hwmon: (jc42) more helpful documentation
hwmon: (jc42) fix type mismatch
This reverts commit 9b29050f8f.
It has caused hibernate regressions, for example Juri Sladby's report:
"I'm unable to hibernate 2.6.37.1 unless I rmmod tpm_tis:
[10974.074587] Suspending console(s) (use no_console_suspend to debug)
[10974.103073] tpm_tis 00:0c: Operation Timed out
[10974.103089] legacy_suspend(): pnp_bus_suspend+0x0/0xa0 returns -62
[10974.103095] PM: Device 00:0c failed to freeze: error -62"
and Rafael points out that some of the new conditionals in that commit
seem to make no sense. This commit needs more work and testing, let's
revert it for now.
Reported-by: Norbert Preining <preining@logic.at>
Reported-and-requested-by: Jiri Slaby <jirislaby@gmail.com>
Cc: Stefan Berger <stefanb@linux.vnet.ibm.com>
Cc: Guillaume Chazarain <guichaz@gmail.com>
Cc: Rajiv Andrade <srajiv@linux.vnet.ibm.com>
Acked-by: Rafael J. Wysocki <rjw@sisk.pl>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>