* git://kvm.qumranet.com/home/avi/kvm:
KVM: always reload segment selectors
KVM: Prevent system selectors leaking into guest on real->protected mode transition on vmx
This reverts commit 94985134b7 and
insteads removes the WARN_ON() that caused that commit in the first
place.
The problem is that we call disable_nonboot_cpus() in swsusp before
powering down the system in order to avoid triggering the WARN_ON()
in arch/x86_64/kernel/acpi/sleep.c:init_low_mapping() and this doesn't
work well on Thomas' system.
So instead, remove the WARN_ON() in arch/x86_64/kernel/acpi/sleep.c:
init_low_mapping(), which triggers every time during the suspend to disk
in the platform mode, as the potential problem it is related to doesn't
seem to occur in practice.
[ I think we might want to disallow the case of multiple users of that
mm, or something. Normally, playing with the current process page
tables on the current CPU should be fine as long as we don't have
other threads using those tables at the same time..
Anyway, not pretty, but better than the warning or the lockup - Linus ]
Signed-off-by: Rafael J. Wysocki <rjw@sisk.pl>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
The clockevents / tick management code expects an error value, when the
event is already expired. hpet_next_event() returns 1 in that case.
Fix it to return the proper -ETIME error code.
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
* 'for-linus' of git://git.kernel.dk/data/git/linux-2.6-block:
Export __splice_from_pipe()
2/2 splice: dont readpage
1/2 splice: dont steal
make elv_register() output atomic
block: blk_max_pfn is somtimes wrong
Without attached patch against current -git I get following with
!PROC_SYSCTL (with EMBEDDED and PROC_FS set):
CC init/version.o
LD init/built-in.o
LD vmlinux
fs/built-in.o: In function `do_proc_sys_lookup':
proc_sysctl.c:(.text+0x26583): undefined reference to `sysctl_head_next'
fs/built-in.o: In function `proc_sys_revalidate':
proc_sysctl.c:(.text+0x265bb): undefined reference to `sysctl_head_finish'
fs/built-in.o: In function `proc_sys_readdir':
proc_sysctl.c:(.text+0x26720): undefined reference to `sysctl_head_next'
proc_sysctl.c:(.text+0x267d8): undefined reference to `sysctl_head_finish'
proc_sysctl.c:(.text+0x268e7): undefined reference to `sysctl_head_next'
proc_sysctl.c:(.text+0x26910): undefined reference to `sysctl_head_finish'
fs/built-in.o: In function `proc_sys_write':
proc_sysctl.c:(.text+0x2695d): undefined reference to `sysctl_perm'
proc_sysctl.c:(.text+0x2699c): undefined reference to `sysctl_head_finish'
fs/built-in.o: In function `proc_sys_read':
proc_sysctl.c:(.text+0x269e9): undefined reference to `sysctl_perm'
proc_sysctl.c:(.text+0x26a25): undefined reference to `sysctl_head_finish'
fs/built-in.o: In function `proc_sys_permission':
proc_sysctl.c:(.text+0x26ad1): undefined reference to `sysctl_perm'
proc_sysctl.c:(.text+0x26adb): undefined reference to `sysctl_head_finish'
fs/built-in.o: In function `proc_sys_lookup':
proc_sysctl.c:(.text+0x26b39): undefined reference to `sysctl_head_finish'
make: *** [vmlinux] Virhe 1
All those functions are in fs/proc/proc_sysctl.c, which has no CONFIG_
#define's in it, so the patch makes the compilation of that file to depend
on CONFIG_PROC_SYSCTL (the simplest choice).
Acked-by: "Eric W. Biederman" <ebiederm@xmission.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Because i don't have much time lately and my responses are pretty slow
it's probably best to remove me from MAINTAINERS to give someone else
the chance to jump in.
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
When CONFIG_IPC_NS=n, clone(CLONE_NEWIPC) claims success, but did not actually
clone a new IPC namespace.
Fix this to return -EINVAL so the caller knows his request was denied.
Signed-off-by: Serge E. Hallyn <serue@us.ibm.com>
Cc: "Eric W. Biederman" <ebiederm@xmission.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
I2O subsystem has been broken in mainstream several months ago (after
2.6.18). Commit 4aff5e2333 from Jens
Axboe split struct request ->flags into two parts: cmd_type and
cmd_flags.
In i2o layer this patch has replaced flag REQ_SPECIAL by the according
cmd_type. However i2o has used REQ_SPECIAL not as command type but as
driver-specific flag for the debug purposes. As result all i2o requests
have type "special" now, are not processed to the hardware and fail with
I/O error:
i2o/hda:<3>Buffer I/O error on device i2o/hda, logical block 0
Buffer I/O error on device i2o/hda, logical block 0
Buffer I/O error on device i2o/hda, logical block 0
unable to read partition table
block-osm: device added (TID: 207): i2o/hda
The following patch removes the extra debug checks without any drawbacks and
restores the normal driver's work.
Signed-off-by: Vasily Averin <vvs@sw.ru>
Acked-by: Markus Lidel <Markus.Lidel@shadowconnect.com>
Cc: Jens Axboe <jens.axboe@oracle.com>
Cc: <stable@kernel.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
WARNING: drivers/built-in.o - Section mismatch: reference to .init.text:eisa_root_register from .text between 'pci_eisa_init' (at offset 0xabf670) and 'virtual_eisa_release'
AFAIK a PCI to EISA bridge isn't anything hotpluggable, so
pci_eisa_init() can become __init.
Signed-off-by: Adrian Bunk <bunk@stusta.de>
Cc: Greg KH <greg@kroah.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
I've been seeing some odd NTP behavior recently on a few boxes and
finally narrowed it down to time_offset overflowing when converted to
SHIFT_UPDATE units (which was a side effect from my HZfreeNTP patch).
This patch converts time_offset from a long to a s64 which resolves the
issue.
[tglx@linutronix.de: signedness fixes]
Signed-off-by: John Stultz <johnstul@us.ibm.com>
Cc: Roman Zippel <zippel@linux-m68k.org>
Cc: john stultz <johnstul@us.ibm.com>
Cc: Ingo Molnar <mingo@elte.hu>
Cc: Thomas Gleixner <tglx@linutronix.de>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
This patch uses MAX_REG_NR consistently to refer to the register file size.
FRAME_SIZE isn't sufficient because on x86_64, it is smaller than the
ptrace register file size. MAX_REG_NR was introduced as a consistent way
to get the number of registers, but wasn't used everywhere it should be.
When this causes a problem, it makes PTRACE_SETREGS fail on x86_64 because
of a corrupted segment register value in the known-good register file. The
patch also adds a register dump at that point in case there are any future
problems here.
Signed-off-by: Jeff Dike <jdike@linux.intel.com>
Cc: Paolo 'Blaisorblade' Giarrusso <blaisorblade@yahoo.it>
Cc: <stable@kernel.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
During a static link, ld has started putting a .note section in the
.uml.setup.init section. This has the result that the UML setups begin
with 32 bytes of garbage and UML crashes immediately on boot.
This patch creates a specific .note section for ld to drop this stuff
into.
Signed-off-by: Jeff Dike <jdike@linux.intel.com>
Cc: Paolo 'Blaisorblade' Giarrusso <blaisorblade@yahoo.it>
Cc: <stable@kernel.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
WARNING: drivers/built-in.o - Section mismatch: reference to .init.text:spi_register_master from .text between 'spi_bitbang_start' (at offset 0x84e11a) and 'bitbang_work'
WARNING: drivers/built-in.o - Section mismatch: reference to .init.text:spi_alloc_master from .text between 'butterfly_attach' (at offset 0x84e681) and 'at25_remove'
WARNING: drivers/built-in.o - Section mismatch: reference to .init.text:spi_new_device from .text between 'butterfly_attach' (at offset 0x84e7e4) and 'at25_remove'
Signed-off-by: Adrian Bunk <bunk@stusta.de>
Signed-off-by: David Brownell <dbrownell@users.sourceforge.net>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
When CONFIG_UTS_NS=n, clone(CLONE_NEWUTS) quietly refuses. So correctly does
not unshare a new uts namespace, but also does not return -EINVAL.
Fix this to return -EINVAL so the caller knows his request was denied.
Signed-off-by: Serge E. Hallyn <serue@us.ibm.com>
Cc: "Eric W. Biederman" <ebiederm@xmission.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Its now used.. because we added the new definitions so enabled all the
goodies on i386
Signed-off-by: Alan Cox <alan@redhat.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
UML/x86_64 needs the same packing of struct epoll_event as x86_64.
Signed-off-by: Jeff Dike <jdike@linux.intel.com>
Cc: Davide Libenzi <davidel@xmailserver.org>
Cc: <stable@kernel.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
On Bob's machine clocksource is selecting PIT over the ACPI PM timer,
because he has the PIIX4 bug. That bug drops the ACPI PM timers rating
to the same as the PIT, so that's why you're getting the PIT.
Realistically, the PIT is much slower then even the triple read ACPI PM,
so the de-ranking code is probably dropping it too far.
So don't drop ACPI PM quite so low if we see the PIIX4 bug.
Signed-off-by: John Stultz <johnstul@us.ibm.com>
Cc: Bob Tracy <rct@gherkin.frus.com>
Cc: Ingo Molnar <mingo@elte.hu>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: Andi Kleen <ak@suse.de>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Commit d720bc4b8f partially removed a
private implementation of baud speed decoding. However it doesn't seem
to be complete: after the speed is decoded, it is still being used as an
index to a local speed table (array overrun, no doubt).
This was found by Graham Murray who noticed it caused a 2.6.19 regression
with the SX driver: https://bugs.gentoo.org/170554
Signed-off-by: Daniel Drake <dsd@gentoo.org>
Acked-by: Alan Cox <alan@lxorguk.ukuu.org.uk>
Cc: Russell King <rmk@arm.linux.org.uk>
Cc: <stable@kernel.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
... still not sure why we need this ....
Signed-off-by: Neil Brown <neilb@suse.de>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
If this mddev and queue got reused for another array that doesn't register a
congested_fn, this function would get called incorretly.
Signed-off-by: Neil Brown <neilb@suse.de>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
All that is missing the the function pointers in raid4_pers.
Signed-off-by: Neil Brown <neilb@suse.de>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
This cancel_delayed_work call is called from a function that is only called
from a piece of code that immediate follows a cancel and destruction of the
workqueue, so it's clearly a mistake.
Cc: Oleg Nesterov <oleg@tv-sign.ru>
Signed-off-by: "J. Bruce Fields" <bfields@citi.umich.edu>
Signed-off-by: Neil Brown <neilb@suse.de>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
The reused clientid here is a more of a problem for the client than the
server, and the client can report the problem itself if it's serious.
Signed-off-by: "J. Bruce Fields" <bfields@citi.umich.edu>
Signed-off-by: Neil Brown <neilb@suse.de>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
A regression introduced in the last set of acl patches removed the
INHERIT_ONLY flag from aces derived from the posix acl. Fix.
Signed-off-by: "J. Bruce Fields" <bfields@citi.umich.edu>
Signed-off-by: Neil Brown <neilb@suse.de>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
->readdir passes lofft_t offsets (used as nfs cookies) to
nfs3svc_encode_entry{,_plus}, but when they pass it on to encode_entry it
becomes an 'off_t', which isn't good.
So filesystems that returned 64bit offsets would lose.
Signed-off-by: Neil Brown <neilb@suse.de>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Ingo reported it on lkml in the thread
"2.6.21-rc5: maxcpus=1 crash in cpufreq: kernel BUG at drivers/cpufreq/cpufreq.c:82!"
This check added to remove_dev is symmetric to one in add_dev and handles
callbacks for offline cpus cleanly.
Signed-off-by: Venkatesh Pallipadi <venkatesh.pallipadi@intel.com>
Acked-by: Ingo Molnar <mingo@elte.hu>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
failed VM entry on VMX might still change %fs or %gs, thus make sure
that KVM always reloads the segment selectors. This is crutial on both
x86 and x86_64: x86 has __KERNEL_PDA in %fs on which things like
'current' depends and x86_64 has 0 there and needs MSR_GS_BASE to work.
Signed-off-by: Ingo Molnar <mingo@elte.hu>
Intel virtualization extensions do not support virtualizing real mode. So
kvm uses virtualized vm86 mode to run real mode code. Unfortunately, this
virtualized vm86 mode does not support the so called "big real" mode, where
the segment selector and base do not agree with each other according to the
real mode rules (base == selector << 4).
To work around this, kvm checks whether a selector/base pair violates the
virtualized vm86 rules, and if so, forces it into conformance. On a
transition back to protected mode, if we see that the guest did not touch
a forced segment, we restore it back to the original protected mode value.
This pile of hacks breaks down if the gdt has changed in real mode, as it
can cause a segment selector to point to a system descriptor instead of a
normal data segment. In fact, this happens with the Windows bootloader
and the qemu acpi bios, where a protected mode memcpy routine issues an
innocent 'pop %es' and traps on an attempt to load a system descriptor.
"Fix" by checking if the to-be-restored selector points at a system segment,
and if so, coercing it into a normal data segment. The long term solution,
of course, is to abandon vm86 mode and use emulation for big real mode.
Signed-off-by: Avi Kivity <avi@qumranet.com>
After freeing a block there should be no reference to this block.
Signed-off-by: Thomas Viehweger <Thomas.Viehweger@marconi.com>
Signed-off-by: Michael Krufky <mkrufky@linuxtv.org>
Signed-off-by: Mauro Carvalho Chehab <mchehab@infradead.org>
Olaf Hering pointed out that SAA7146_CLIPPING_MEM would become
very large for PAGE_SIZE > 4K.
In fact, the number of clipping windows is limited to 16,
and calculate_clipping_registers_rect() does not use more
than 256 bytes. SAA7146_CLIPPING_MEM adjusted accordingly.
Thanks-to: Olaf Hering <olaf@aepfle.de>
Acked-by: Michael Hunold <hunold@linuxtv.org>
Signed-off-by: Oliver Endriss <o.endriss@gmx.de>
Signed-off-by: Mauro Carvalho Chehab <mchehab@infradead.org>
Returning -1 causes the probe to stop, but it should just continue
instead.
Signed-off-by: Hans Verkuil <hverkuil@xs4all.nl>
Signed-off-by: Mauro Carvalho Chehab <mchehab@infradead.org>
Fix several instances of dvb-core functions using mutex_lock_interruptible
and returning -ERESTARTSYS where the calling function will either never
retry or never check the return value.
These cause a race condition with dvb_dmxdev_filter_free and
dvb_dvr_release, both of which are filesystem release functions whose
return value is ignored and will never be retried. When this happens it
becomes impossible to open dvr0 again (-EBUSY) since it has not been
released properly.
Signed-off-by: Simon Arlott <simon@fire.lp0.eu>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Acked-By: Johannes Stezenbach <js@linuxtv.org>
Signed-off-by: Mauro Carvalho Chehab <mchehab@infradead.org>
All the radio drivers need video_dev, but they were depending on
VIDEO_DEV!=n. That meant that one could try to compile the driver into
the kernel when VIDEO_DEV=m, which will not work. If video_dev is a
module, then the radio drivers must be modules too.
Signed-off-by: Trent Piepho <xyzzy@speakeasy.org>
Signed-off-by: Mauro Carvalho Chehab <mchehab@infradead.org>
Ocfs2 wants to implement it's own splice write actor so that it can better
manage cluster / page locks. This lets us re-use the rest of splice write
while only providing our own code where it's actually important.
Signed-off-by: Mark Fasheh <mark.fasheh@oracle.com>
Signed-off-by: Jens Axboe <jens.axboe@oracle.com>
Splice does not need to readpage to bring the page uptodate before writing
to it, because prepare_write will take care of that for us.
Splice is also wrong to SetPageUptodate before the page is actually uptodate.
This results in the old uninitialised memory leak. This gets fixed as a
matter of course when removing the readpage logic.
Signed-off-by: Nick Piggin <npiggin@suse.de>
Signed-off-by: Jens Axboe <jens.axboe@oracle.com>
Stealing pages with splice is problematic because we cannot just insert
an uptodate page into the pagecache and hope the filesystem can take care
of it later.
We also cannot just ClearPageUptodate, then hope prepare_write does not
write anything into the page, because I don't think prepare_write gives
that guarantee.
Remove support for SPLICE_F_MOVE for now. If we really want to bring it
back, we might be able to do so with a the new filesystem buffered write
aops APIs I'm working on. If we really don't want to bring it back, then
we should decide that sooner rather than later, and remove the flag and
all the stealing infrastructure before anybody starts using it.
Signed-off-by: Nick Piggin <npiggin@suse.de>
Signed-off-by: Jens Axboe <jens.axboe@oracle.com>
Booting 2.6.21-rc3-g45592145 I noticed the following on one of my
machines in the bootlog:
io scheduler noop registered<6>Time: jiffies clocksource has been installed.
io scheduler deadline registered (default)
Looking at block/elevator.c, it appears that elv_register() uses two
consecutive printks in a non-atomic way, leading to the above glitch. The
attached trivial patch fixes this issue, by using a single printk.
Signed-off-by: Thibaut VARENE <varenet@parisc-linux.org>
Signed-off-by: Jens Axboe <jens.axboe@oracle.com>
There is a small problem in handling page bounce.
At the moment blk_max_pfn equals max_pfn, which is in fact not maximum
possible _number_ of a page frame, but the _amount_ of page frames. For
example for the 32bit x86 node with 4Gb RAM, max_pfn = 0x100000, but not
0xFFFF.
request_queue structure has a member q->bounce_pfn and queue needs bounce
pages for the pages _above_ this limit. This routine is handled by
blk_queue_bounce(), where the following check is produced:
if (q->bounce_pfn >= blk_max_pfn)
return;
Assume, that a driver has set q->bounce_pfn to 0xFFFF, but blk_max_pfn
equals 0x10000. In such situation the check above fails and for each bio
we always fall down for iterating over pages tied to the bio.
I want to notice, that for quite a big range of device drivers (ide, md,
...) such problem doesn't happen because they use BLK_BOUNCE_ANY for
bounce_pfn. BLK_BOUNCE_ANY is defined as blk_max_pfn << PAGE_SHIFT, and
then the check above doesn't fail. But for other drivers, which obtain
reuired value from drivers, it fails. For example sata_nv uses
ATA_DMA_MASK or dev->dma_mask.
I propose to use (max_pfn - 1) for blk_max_pfn. And the same for
blk_max_low_pfn. The patch also cleanses some checks related with
bounce_pfn.
Signed-off-by: Vasily Tarasov <vtaras@openvz.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Jens Axboe <jens.axboe@oracle.com>
* 'for-linus' of git://git390.osdl.marist.edu/pub/scm/linux-2.6:
[S390] zcrypt: Fix ap_poll_requests counter in lost requests error path.
[S390] zcrypt: Fix possible dead lock in AP bus module.
[S390] cio: Device status validity.
[S390] kprobes: Align probe address.
[S390] Fix TCP/UDP pseudo header checksum computation.
[S390] dasd: Work around gcc bug.
This patch implements set_mac_address for the sungem driver. This
allows changing the mac address of the interface, even when the
interface is up.
Signed-off-by: Ruben Vandeginste <snowbender@gmail.com>
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
* master.kernel.org:/pub/scm/linux/kernel/git/gregkh/usb-2.6:
USB: fix usb-serial/ftdi build warning
USB: fix usb-serial/generic build warning
USB: another entry for the quirk list
USB: remove duplicated device id in airprime driver
USB: omap_udc: workaround dma_free_coherent() bogosity
UHCI: Fix problem caused by lack of terminating QH
* master.kernel.org:/pub/scm/linux/kernel/git/gregkh/pci-2.6:
PCI: Fix warning message in PCIE port driver
PCI: Stop unhiding the SMBus on Toshiba laptops
PCI: Fix up PCI power management doc
pci: set pci=bfsort for PowerEdge R900