Differentiated I/O Services in Virtualized Environments: Tyler Harter, Salini SK & Anand Krishnamurthy


Overview
Provide differentiated I/O services for applications in guest operating systems in virtual machines
Applications in virtual machines tag I/O requests
The hypervisor's I/O scheduler uses these tags to provide quality of I/O service
Motivation
Varied applications with different I/O requirements are hosted in clouds
I/O scheduling is not optimal if it is agnostic of the semantics of the requests
[Figure: a hypervisor hosting VM 1, VM 2 and VM 3]
Motivation
We want to have high- and low-priority processes that correctly get differentiated service within a VM and between VMs

Can my webserver/DHT log pusher's I/O be served differently from my webserver/DHT's I/O?
Existing work & Problems
VMware's ESX Server offers Storage I/O Control (SIOC)
Provides I/O prioritization of virtual machines that access a shared storage pool
But it supports prioritization only at host granularity!
Existing work & Problems
Xen credit scheduler also works at domain level

Linux's CFQ I/O scheduler supports I/O prioritization

Possible to use priorities at both the guest's and the hypervisor's I/O scheduler
Original Architecture
[Figure: guest VMs, each running high- and low-priority applications that issue syscalls to the guest I/O scheduler (e.g., CFQ); I/O leaves each guest through a QEMU virtual SCSI disk and reaches the host's syscall layer and I/O scheduler (e.g., CFQ)]
Problem 1: low- and high-priority requests may get the same service
Problem 2: does not utilize host caches
Existing work & Problems
Current state of the art doesn't provide differentiated services at guest application level granularity
Solution
Tag I/O and prioritize in the hypervisor

Outline
KVM/Qemu, a brief intro
KVM/Qemu I/O stack
Multi-level I/O tagging
I/O scheduling algorithms
Evaluation
Summary


KVM/Qemu, a brief intro
[Figure: Linux Standard Kernel with KVM (Hypervisor) running on the hardware]
The KVM module has been part of the Linux kernel since version 2.6.20
Linux has all the mechanisms a VMM needs to operate several VMs
Has 3 modes: kernel, user, guest
kernel-mode: switches into guest-mode and handles exits due to I/O operations
user-mode: performs I/O when the guest needs to access devices
guest-mode: executes guest code, i.e., the guest OS except for I/O
Relies on a virtualization-capable CPU with either Intel VT or AMD SVM extensions
KVM/Qemu, a brief intro
Each virtual machine is a user-space process
[Figure: VMs run next to libvirt and other user-space processes on the Linux Standard Kernel with KVM (Hypervisor), above the hardware]
KVM/Qemu I/O stack
[Figure: guest I/O stack: application in guest OS, system calls layer (read, write, stat, ...), VFS, file system, buffer cache, block layer, SCSI / ATA]
An application in the guest OS issues an I/O-related system call (e.g., read(), write(), stat()) within a user-space context of the virtual machine.
This system call leads to an I/O request being submitted from within the kernel space of the VM.
The I/O request reaches a device driver, either an ATA-compliant (IDE) or a SCSI one.
KVM/Qemu I/O stack
The device driver issues privileged instructions to read from or write to the memory regions exported over PCI by the corresponding device.
KVM/Qemu I/O stack
[Figure: Hardware, Linux Standard Kernel with KVM (Hypervisor), Qemu emulator]
These instructions trigger VM-exits, which are handled by the core KVM module within the host's kernel-space context.
The hypervisor passes the privileged I/O-related instructions to the QEMU machine emulator.
A VM-exit takes place for each of the privileged instructions resulting from the original I/O request in the VM.
KVM/Qemu I/O stack
These instructions are then emulated by device-controller emulation modules within QEMU (either as ATA or as SCSI commands).
QEMU generates block-access I/O requests in a special block-device emulation module.
Thus the original I/O request generates I/O requests to the kernel space of the host.
Upon completion of the system calls, QEMU "injects" an interrupt into the VM that originally issued the I/O request.
Multi-level I/O tagging modifications
Modification 1: pass priorities via syscalls
Modification 2: NOOP+ at guest I/O scheduler
Modification 3: extend SCSI protocol with prio
Modification 4: share-based prio sched in host
Modification 5: use new calls in benchmarks
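To make the tagging path concrete, here is a minimal Python sketch of how a priority tag might travel from a guest application down to the command sent to the host. It is only an illustration under assumptions: the slides do not give the names of the new syscalls or of the added SCSI field, so `guest_read`, `GuestRequest`, `ScsiCommand` and the `prio` field are all hypothetical, and NOOP+ is modelled simply as a FIFO that keeps the tag attached.

```python
from collections import deque
from dataclasses import dataclass


@dataclass
class GuestRequest:
    sector: int
    prio: int  # tag supplied by the application (hypothetical priority-aware syscall)


@dataclass
class ScsiCommand:
    lba: int
    prio: int  # hypothetical extra field carrying the tag to the host


def guest_read(fifo, sector, prio):
    """Stand-in for a priority-aware read syscall: queue a tagged request."""
    fifo.append(GuestRequest(sector, prio))


def guest_dispatch(fifo):
    """Assumed NOOP+ behaviour: no reordering in the guest, tag passed through."""
    req = fifo.popleft()
    return ScsiCommand(lba=req.sector, prio=req.prio)


fifo = deque()
guest_read(fifo, sector=100, prio=1)
guest_read(fifo, sector=200, prio=7)
commands = [guest_dispatch(fifo), guest_dispatch(fifo)]
```

The point of the sketch is that the guest does not act on the priority itself; it only preserves it, so that the share-based scheduler in the host can make the decision.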
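Scheduler algorithm: Stride
$\mathit{ID}_i$ - ID of application $i$
$\mathit{Shares}_i$ = shares assigned to $\mathit{ID}_i$
$\mathit{VIO}_i$ = virtual I/O counter for $\mathit{ID}_i$
$\mathit{Stride}_i$ = Global_shares / $\mathit{Shares}_i$

Dispatch_request()
{
    Select the ID $k$ which has the lowest virtual I/O counter $\mathit{VIO}_k$
    Increase $\mathit{VIO}_k$ by $\mathit{Stride}_k$
    if ($\mathit{VIO}_k$ reaches threshold)
        Reinitialize all $\mathit{VIO}_i$ to 0
    Dispatch the request in queue $k$
}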
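The dispatch loop above can be sketched in a few lines of Python. This is a toy model, not the host scheduler from the talk: the class and method names are made up for illustration, Global_shares is assumed to be the sum of all assigned shares, and the threshold value is arbitrary. Each application has a virtual I/O counter that grows by Global_shares divided by its shares every time one of its requests is dispatched, and the application with the lowest counter goes next.

```python
from collections import deque
from fractions import Fraction


class StrideScheduler:
    """Toy share-based (stride) I/O scheduler; all names are illustrative."""

    def __init__(self, shares, threshold=1_000_000):
        # Assumption: Global_shares is the sum of all assigned shares.
        self.global_shares = sum(shares.values())
        # Per-application increment: Global_shares / Shares_i (exact fractions).
        self.stride = {i: Fraction(self.global_shares, s) for i, s in shares.items()}
        self.vio = {i: Fraction(0) for i in shares}  # virtual I/O counters
        self.queues = {i: deque() for i in shares}
        self.threshold = threshold

    def submit(self, app_id, request):
        self.queues[app_id].append(request)

    def dispatch(self):
        """Return (app_id, request) for the next request, or None if idle."""
        backlogged = [i for i, q in self.queues.items() if q]
        if not backlogged:
            return None
        # Select the ID with the lowest virtual I/O counter.
        k = min(backlogged, key=lambda i: self.vio[i])
        self.vio[k] += self.stride[k]
        # Reinitialize all counters once one of them reaches the threshold.
        if self.vio[k] >= self.threshold:
            for i in self.vio:
                self.vio[i] = Fraction(0)
        return k, self.queues[k].popleft()
```

With shares of 3 and 1 and both queues kept full, this sketch serves the two applications in a 3:1 ratio, which is the behaviour the share assignment is meant to produce.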
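Scheduler algorithm, contd.
Problem: a sleeping process can monopolize the resource once it wakes up after a long time, since its virtual I/O counter lags behind the others
Solution:
If a sleeping process $k$ wakes up, then set its virtual I/O counter
$\mathit{VIO}_k = \max(\min(\text{all } \mathit{VIO}_i \text{ which are non-zero}),\ \mathit{VIO}_k)$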
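As a worked example of the wake-up rule, the following Python sketch computes the new counter for a process that has just woken up. The function name is hypothetical. One assumption goes beyond the slide: the minimum is taken over the other applications only, because including the sleeper's own stale counter in the minimum would leave it unchanged.

```python
def vio_on_wakeup(vio, k):
    """Counter for app k after wake-up: max(min non-zero counter of others, own).

    Assumption: the minimum ranges over applications other than k.
    """
    others = [v for app, v in vio.items() if app != k and v != 0]
    if not others:
        return vio[k]  # nothing to catch up to
    return max(min(others), vio[k])


# A sleeper with a stale counter of 10 is pulled up to the slowest active peer.
counters = {"a": 900, "b": 1000, "k": 10}
counters["k"] = vio_on_wakeup(counters, "k")
```

After the adjustment the woken process competes from the same starting point as the least-served active application, instead of being selected repeatedly until its counter catches up.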
Evaluation
Tested on HDD and SSD
Configuration:
Guest RAM size: 1 GB
Host RAM size: 8 GB
Hard disk RPM: 7200
SSD: 35000 IOPS Rd, 85000 IOPS Wr
Guest OS: Ubuntu Server 12.10, LK 3.2
Host OS: Kubuntu 12.04, LK 3.2
Filesystem (Host/Guest): Ext4
Virtual disk image format: qcow2
Results
Metrics:
Throughput
Latency
Benchmarks:
Filebench
Sysbench
Voldemort (distributed key-value store)
Shares vs Throughput for different workloads: HDD
Shares vs Latency for different workloads: HDD
Priorities are better respected if most of the read requests hit the disk



Effective Throughput for various dispatch numbers: HDD
Priorities are respected only when the dispatch number of the disk is lower than the number of read requests generated by the system at a time
Downside: the effective throughput is directly proportional to the dispatch number of the disk
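A toy calculation may help show why the dispatch number matters. The Python sketch below is an assumed model, not the measurement setup from the talk: it treats the dispatch number as the number of requests the disk accepts in one round, lets a share-based selection (lowest virtual counter first) fill those slots, and counts how many requests of each application get in. The function name and the example numbers are hypothetical.

```python
from fractions import Fraction


def admitted_in_one_round(shares, outstanding, dispatch_number):
    """Count requests per app that enter the disk queue in one round (toy model)."""
    total_shares = sum(shares.values())
    vio = {app: Fraction(0) for app in shares}  # virtual I/O counters
    left = dict(outstanding)                    # requests still waiting
    admitted = {app: 0 for app in shares}
    for _ in range(dispatch_number):
        ready = [app for app in left if left[app] > 0]
        if not ready:
            break  # the disk could take more, but nothing is waiting
        k = min(ready, key=lambda app: vio[app])
        vio[k] += Fraction(total_shares, shares[k])
        left[k] -= 1
        admitted[k] += 1
    return admitted


shares = {"high": 3, "low": 1}
outstanding = {"high": 8, "low": 8}
small = admitted_in_one_round(shares, outstanding, dispatch_number=8)
large = admitted_in_one_round(shares, outstanding, dispatch_number=32)
```

In this model a dispatch number of 8 admits 6 high-share and 2 low-share requests, while a dispatch number of 32 admits all 16 requests, so the shares no longer influence what the disk sees.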





Shares vs Throughput for different workloads: SSD
Shares vs Latency for different workloads: SSD
Priorities in SSDs are respected only under heavy load, since SSDs are faster



Comparison between different schedulers
Only Noop+LKMS respects priority! (Has to be, since we did it)
Results
[Table: Hard drive/SSD results per workload. Columns: Webserver, Mailserver, Random Reads, Sequential Reads, Voldemort DHT Reads. Rows: Hard disk, Flash. Cell values did not survive extraction.]
Summary
It works!!!
Preferential service is possible only when the dispatch number of the disk is lower than the number of read requests generated by the system at a time
But a lower dispatch number reduces the effective throughput of the storage
On SSDs, preferential service is only possible under heavy load
Scheduling at the lowermost layer yields better differentiated services
Future work
Get it working for writes
Get evaluations on VMware ESX SIOC and compare with
our results



