kernel_optimize_test/drivers
Michael S. Tsirkin 3a4d5c94e9 vhost_net: a kernel-level virtio server
What it is: vhost net is a character device that can be used to reduce
the number of system calls involved in virtio networking.
Existing virtio net code is used in the guest without modification.

It is similar to vringfd, with some differences and a reduced scope:
- uses eventfd for signalling
- structures can be moved around in memory at any time (good for
  migration, bug work-arounds in userspace)
- write logging is supported (good for migration)
- supports a memory table and not just an offset (needed for kvm)

Common virtio-related code has been put in a separate file, vhost.c, and
can be made into a separate module if/when more backends appear.  I used
Rusty's lguest.c as the source for developing this part: this supplied
me with witty comments I wouldn't be able to write myself.

What it is not: vhost net is not a bus, and not a generic new system
call. No assumptions are made about how the guest performs hypercalls.
Userspace hypervisors are supported as well as kvm.

How it works: Basically, we connect a virtio frontend (configured by
userspace) to a backend. The backend could be a network device, or a tap
device.  The backend is also configured by userspace, including vlan/mac
etc.

Status: This works for me, and I haven't seen any crashes.
Compared to userspace, people reported improved latency (as I save up to
4 system calls per packet), as well as better bandwidth and CPU
utilization.

Features that I plan to look at in the future:
- mergeable buffers
- zero copy
- scalability tuning: figure out the best threading model to use

Note on RCU usage (this is also documented in vhost.h, near
private_pointer, which is the value protected by this variant of RCU):
rcu_dereference() is used inside a workqueue item.  The role of
rcu_read_lock() is taken on by the start of execution of the workqueue
item, that of rcu_read_unlock() by the end of execution of the workqueue
item, and that of synchronize_rcu() by flush_workqueue()/flush_work().
In the future we might need to apply some gcc attribute or sparse
annotation to the function passed to INIT_WORK(). Paul's ack below is
for this RCU usage.
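
The ordering described in the RCU note can be modelled in userspace.
What follows is a minimal single-threaded sketch, not the vhost code:
struct model_wq, model_queue_work(), model_flush(), handle_tx() and
set_backend() are hypothetical names made up for illustration, and the
"workqueue" is just a linked list that model_flush() drains by running
each item. It only shows why an updater that flushes after swapping the
pointer can safely free the old value; it assumes nothing about the
real kernel data structures.

```c
#include <stddef.h>
#include <stdlib.h>

/* Hypothetical userspace model; none of these names are kernel APIs. */

struct backend {
	int id;
};

struct work {
	void (*fn)(struct work *w);
	struct work *next;
	int seen_backend_id;	/* recorded by the handler, for inspection */
};

struct model_wq {
	struct work *head;
	struct work *tail;
};

/* The value protected by the workqueue-based RCU variant. */
struct backend *private_pointer;

/* Append a work item to the queue; it runs at the next flush. */
void model_queue_work(struct model_wq *wq, struct work *w)
{
	w->next = NULL;
	if (wq->tail)
		wq->tail->next = w;
	else
		wq->head = w;
	wq->tail = w;
}

/*
 * Stand-in for flush_workqueue(): returns only after every queued item
 * has run to completion. In this single-threaded model it achieves
 * that by running the items itself.
 */
void model_flush(struct model_wq *wq)
{
	while (wq->head) {
		struct work *w = wq->head;

		wq->head = w->next;
		if (!wq->head)
			wq->tail = NULL;
		w->fn(w);	/* start of item = "rcu_read_lock()" */
				/* end of item   = "rcu_read_unlock()" */
	}
}

/* Read side: the pointer is only dereferenced inside a work item. */
void handle_tx(struct work *w)
{
	struct backend *b = private_pointer;	/* rcu_dereference() in the kernel */

	w->seen_backend_id = b ? b->id : -1;
}

/* Update side: publish the new pointer, flush, then free the old one. */
void set_backend(struct model_wq *wq, struct backend *newb)
{
	struct backend *old = private_pointer;

	private_pointer = newb;	/* rcu_assign_pointer() in the kernel */
	model_flush(wq);	/* plays the role of synchronize_rcu() */
	free(old);		/* no work item can still be using old */
}
```

Because every reader is a work item, "all readers have finished" is the
same statement as "the queue has been flushed", which is why the flush
can stand in for synchronize_rcu() in this scheme.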

(Includes fixes by Alan Cox <alan@linux.intel.com>,
David L Stevens <dlstevens@us.ibm.com>,
Chris Wright <chrisw@redhat.com>)

Acked-by: Rusty Russell <rusty@rustcorp.com.au>
Acked-by: Arnd Bergmann <arnd@arndb.de>
Acked-by: "Paul E. McKenney" <paulmck@linux.vnet.ibm.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2010-01-15 01:43:29 -08:00
..
accessibility drop explicit include of autoconf.h 2009-12-12 13:08:15 +01:00
acpi Merge branch 'misc-2.6.33' into release 2009-12-16 14:22:32 -05:00
amba
ata pata_bf54x: handle portmuxing of pins through GPIO PORTs 2009-12-21 13:55:38 -05:00
atm drivers/atm/lanai.c: use %pM to show MAC address 2010-01-07 01:13:57 -08:00
auxdisplay
base devtmpfs: unlock mutex in case of string allocation error 2009-12-23 11:23:44 -08:00
block Merge branch 'for-2.6.33' of git://git.kernel.dk/linux-2.6-block 2009-12-15 09:11:28 -08:00
bluetooth Bluetooth: Prevent ill-timed autosuspend in USB driver 2009-12-17 12:12:49 -08:00
cdrom Merge git://git.kernel.org/pub/scm/linux/kernel/git/lethal/sh-2.6 2009-12-09 19:03:16 -08:00
char kfifo: fix warn_unused_result 2009-12-22 14:17:56 -08:00
clocksource cs5535: add a generic clock event MFGPT driver 2009-12-15 08:53:28 -08:00
connector
cpufreq Merge branch 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tj/percpu 2009-12-14 09:58:24 -08:00
cpuidle drivers/cpuidle: Move dereference after NULL test 2009-12-15 08:53:25 -08:00
crypto Merge branch 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tj/percpu 2009-12-14 09:58:24 -08:00
dca
dio
dma Merge branch 'next' of git://git.kernel.org/pub/scm/linux/kernel/git/djbw/async_tx 2009-12-16 10:28:56 -08:00
edac Merge branch 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/bp/bp 2009-12-16 10:09:43 -08:00
eisa
firewire Merge branch 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/ieee1394/linux1394-2.6 2009-12-11 15:22:27 -08:00
firmware drivers/firmware/iscsi_ibft.c: use %pM to show MAC address 2010-01-07 01:13:56 -08:00
gpio gpiolib: add support for changing value polarity in sysfs 2009-12-16 07:20:01 -08:00
gpu Merge branch 'drm-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/airlied/drm-2.6 2009-12-23 08:59:32 -08:00
hid drop explicit include of autoconf.h 2009-12-12 13:08:15 +01:00
hwmon Merge branch 'hwmon-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/jdelvare/staging 2009-12-17 16:48:08 -08:00
i2c const: constify remaining dev_pm_ops 2009-12-15 08:53:25 -08:00
ide Merge branch 'next' of git://git.kernel.org/pub/scm/linux/kernel/git/benh/powerpc 2009-12-12 14:27:24 -08:00
idle cpumask: convert drivers/idle/i7300_idle.c to cpumask_var_t 2009-12-17 11:43:25 +10:30
ieee1394 Merge branch 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/jikos/trivial 2009-12-09 19:43:33 -08:00
ieee802154
infiniband drivers/infiniband/hw/cxgb3/iwch_cm.c: use %pM to show MAC address 2010-01-07 01:17:27 -08:00
input Merge branch 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/dtor/input 2009-12-16 10:31:44 -08:00
isdn proc_fops: convert drivers/isdn/ to seq_file 2010-01-14 03:10:54 -08:00
leds leds: leds-pwm: Set led_classdev max_brightness 2009-12-17 11:42:34 +00:00
lguest Merge branch 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tj/percpu 2009-12-14 09:58:24 -08:00
macintosh Merge branch 'next' of git://git.kernel.org/pub/scm/linux/kernel/git/benh/powerpc 2009-12-12 14:27:24 -08:00
mca
md Merge git://git.kernel.org/pub/scm/linux/kernel/git/agk/linux-2.6-dm 2009-12-15 09:12:01 -08:00
media drivers/media/dvb/dvb-core/dvb_net.c: use %pM to show MAC address 2010-01-07 01:13:58 -08:00
memstick
message drivers/message/i2o/i2o_proc.c: use %pM to show MAC address 2010-01-07 01:18:23 -08:00
mfd mfd: compile fix for twl4030 renaming 2009-12-15 09:33:36 -08:00
misc iwmc3200top: simplify imwct_tx 2009-12-23 14:13:32 -08:00
mmc sdhci-of: add support for the wii sdhci controller 2009-12-17 15:45:32 -08:00
mtd Merge branch 'upstream' of git://ftp.linux-mips.org/pub/scm/upstream-linus 2009-12-17 16:38:06 -08:00
net tun: export underlying socket 2010-01-15 01:43:28 -08:00
nubus
of
oprofile Merge branch 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tj/percpu 2009-12-14 09:58:24 -08:00
parisc parisc: Fixup last users of irq_chip->typename 2009-12-16 03:48:56 +00:00
parport parport_pc.c: use correct length in strncmp 2009-12-16 07:20:12 -08:00
pci Merge git://git.infradead.org/iommu-2.6 2009-12-16 10:11:38 -08:00
pcmcia PCMCIA: fix pxa2xx_lubbock modular build error 2009-12-16 20:11:02 +00:00
platform kfifo: rename kfifo_put... into kfifo_in... and kfifo_get... into kfifo_out... 2009-12-22 14:17:56 -08:00
pnp Merge branch 'release' of git://git.kernel.org/pub/scm/linux/kernel/git/lenb/linux-acpi-2.6 2009-12-16 12:33:19 -08:00
power Merge git://git.infradead.org/battery-2.6 2009-12-15 08:59:33 -08:00
pps
ps3
rapidio
regulator regulator: wm831x_reg_read() failure unnoticed in wm831x_aldo_get_mode() 2009-12-17 10:27:30 +00:00
rtc Merge branch 'upstream' of git://ftp.linux-mips.org/pub/scm/upstream-linus 2009-12-17 16:38:06 -08:00
s390 qeth: default BLKT values for new OSA/3 hardware 2010-01-13 20:34:57 -08:00
sbus
scsi Merge branch 'master' of master.kernel.org:/pub/scm/linux/kernel/git/davem/net-2.6 2010-01-10 22:55:03 -08:00
serial Merge branch 'master' of master.kernel.org:/pub/scm/linux/kernel/git/davem/net-2.6 2010-01-10 22:55:03 -08:00
sfi
sh
sn ioc3/ioc4: fix error path on driver registration 2009-12-15 08:53:27 -08:00
spi Merge branch 'next-spi' of git://git.secretlab.ca/git/linux-2.6 2009-12-17 15:59:05 -08:00
ssb
staging Staging/vt66*: kconfig, depends on WLAN 2009-12-23 11:27:50 -08:00
tc
telephony
thermal Merge branch 'misc-2.6.33' into release 2009-12-16 14:22:32 -05:00
uio const: constify remaining dev_pm_ops 2009-12-15 08:53:25 -08:00
usb USB: Fix a bug on appledisplay.c regarding signedness 2009-12-23 11:34:20 -08:00
uwb
vhost vhost_net: a kernel-level virtio server 2010-01-15 01:43:29 -08:00
video Merge branch 'omap-fixes-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tmlind/linux-omap-2.6 2009-12-17 16:57:49 -08:00
virtio
vlynq
w1
watchdog watchdog: update geodewdt for new MFGPT API 2009-12-18 10:19:57 -08:00
xen Merge branch 'linux-next' of git://git.kernel.org/pub/scm/linux/kernel/git/jbarnes/pci-2.6 2009-12-11 12:18:16 -08:00
zorro
Kconfig cs5535: add a generic clock event MFGPT driver 2009-12-15 08:53:28 -08:00
Makefile vhost_net: a kernel-level virtio server 2010-01-15 01:43:29 -08:00