VIRTUALIZATION
Instruction Set Architecture Level
At the ISA level, virtualization is performed by emulating a given ISA by the ISA of the host
machine. The basic emulation method is code interpretation: an interpreter program translates the source instructions into target instructions one at a time. One source instruction may require tens or hundreds of native target instructions to perform its function, so this process is relatively slow.
For better performance, dynamic binary translation is desired. This approach translates basic blocks of
dynamic source instructions to target instructions. The basic blocks can also be extended to program
traces or super blocks to increase translation efficiency. A virtual instruction set architecture (V-ISA)
thus requires adding a processor-specific software translation layer to the compiler.
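To make the contrast concrete, the following minimal sketch (in Python, with purely illustrative names; a real translator would emit native machine code, not Python closures) shows an interpreter loop next to the kind of basic-block translation cache that dynamic binary translation relies on.

def interpret(program, state):
    """Interpretation: decode and emulate one source instruction at a time."""
    for op, arg in program:
        if op == "ADD":                     # each decode/dispatch costs many
            state["acc"] += arg             # native instructions per source op
        elif op == "MUL":
            state["acc"] *= arg
    return state

def compile_op(op, arg):
    if op == "ADD":
        return lambda state: state.update(acc=state["acc"] + arg)
    return lambda state: state.update(acc=state["acc"] * arg)

translation_cache = {}                      # basic block -> translated code

def translate_block(block):
    """Dynamic binary translation: translate a basic block once, reuse it."""
    key = tuple(block)
    if key not in translation_cache:
        steps = [compile_op(op, arg) for op, arg in block]
        translation_cache[key] = lambda state: [f(state) for f in steps]
    return translation_cache[key]

state = {"acc": 1}
translate_block([("ADD", 2), ("MUL", 3)])(state)      # translated once
print(state["acc"])                                   # 9
state2 = interpret([("ADD", 2), ("MUL", 3)], {"acc": 1})  # same result, re-decoded every run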
Hardware Abstraction Level
Hardware-level virtualization is performed right on top of the bare hardware and generates a virtual hardware environment for a VM. The virtualization layer also manages the underlying hardware on behalf of the VMs. The idea is to virtualize a computer's resources, such as its processors, memory, and I/O devices, so that the hardware utilization rate can be improved by multiple users sharing it concurrently.
Operating System Level
OS-level virtualization creates isolated containers on a single physical server, with the OS instances utilizing the hardware and software in data centers. The containers behave like real servers. OS-level virtualization is commonly used to create virtual hosting environments that allocate hardware resources among a large number of mutually distrusting users.
Library Support Level
Virtualization with library interfaces is possible by controlling the communication link between
applications and the rest of a system through API hooks. The software tool WINE has implemented this
approach to support Windows applications on top of UNIX hosts. Another example is vCUDA, which allows applications executing within VMs to leverage GPU hardware acceleration.
User-Application Level
On a traditional OS, an application often runs as a process. Therefore, application-level
virtualization is also known as process-level virtualization. The most popular approach is to deploy
high level language (HLL) VMs. In this scenario, the virtualization layer exports an abstraction of a
VM that can run programs written and compiled to a particular abstract machine definition. Any
program written in the HLL and compiled for this VM will be able to run on it. The Microsoft .NET
CLR and Java Virtual Machine (JVM) are two good examples of this class of VM. Other forms of
application-level virtualization are known as application isolation, application sandboxing, or
application streaming. The process involves wrapping the application in a layer that is isolated from the
host OS and other applications. The result is an application that is much easier to distribute and remove
from user workstations.
Virtualization Support at the OS Level
It is slow to initialize a hardware-level VM because each VM creates its own image from scratch, and storing such images is also slow. OS-level virtualization provides a feasible solution
for these hardware-level virtualization issues. OS virtualization inserts a virtualization layer inside an
operating system to partition a machine’s physical resources. It enables multiple isolated VMs within a
single operating system kernel. This kind of VM is often called a virtual execution environment (VE).
This VE has its own set of processes, file system, user accounts, network interfaces with IP addresses,
routing tables, firewall rules, and other personal settings.
Advantages:
VMs at the operating system level have minimal startup/shutdown costs, low resource requirements, and high scalability.
It is possible for a VM and its host environment to synchronize state changes when necessary.
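As a rough illustration of the per-VE state listed above, the following sketch models two VEs sharing one kernel while holding disjoint resources; the VirtualEnvironment class and its fields are illustrative stand-ins, not a real container API.

from dataclasses import dataclass, field

@dataclass
class VirtualEnvironment:
    name: str
    processes: list = field(default_factory=list)       # private process set
    filesystem_root: str = "/"                          # private file system view
    users: dict = field(default_factory=dict)           # private user accounts
    ip_addresses: list = field(default_factory=list)    # private network identity
    firewall_rules: list = field(default_factory=list)  # private firewall policy

# Two VEs share one kernel but see disjoint resources:
ve1 = VirtualEnvironment("web", filesystem_root="/ve/web", ip_addresses=["10.0.0.2"])
ve2 = VirtualEnvironment("db",  filesystem_root="/ve/db",  ip_addresses=["10.0.0.3"])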
The Xen Architecture
Like other virtualization systems, many guest OSes can run on top of the Xen hypervisor. The privileged guest OS, which has control ability, is called Domain 0, and the others are called Domain U. Domain 0 is first loaded when Xen boots, without any file system drivers being available; it is designed to access hardware directly and manage devices.
Binary Translation with Full Virtualization
Depending on implementation technologies, hardware virtualization can be classified into two
categories: full virtualization and host-based virtualization.
Full Virtualization
With full virtualization, noncritical instructions run on the hardware directly while critical
instructions are discovered and replaced with traps into the VMM to be emulated by software. Both the
hypervisor and VMM approaches are considered full virtualization. Noncritical instructions do not control hardware or threaten the security of the system, but critical instructions do. Therefore, running noncritical instructions directly on hardware not only improves efficiency but also ensures system security.
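A minimal sketch of this scan-and-trap idea follows; the instruction names, the CRITICAL set, and the vmm_emulate helper are toy stand-ins, not a real VMM interface.

CRITICAL = {"LOAD_CR3", "HLT", "OUT"}     # privileged/sensitive ops must trap

def vmm_emulate(instr, vm_state):
    """VMM software emulation of a critical instruction."""
    if instr == "LOAD_CR3":
        vm_state["shadow_cr3"] = vm_state["guest_cr3"]   # e.g., fix up paging
    # ... other critical instructions handled here ...

def execute(stream, vm_state):
    for instr in stream:
        if instr in CRITICAL:
            vmm_emulate(instr, vm_state)  # replaced with a trap into the VMM
        else:
            pass                          # noncritical: runs on hardware directly

execute(["ADD", "LOAD_CR3", "MOV"], {"guest_cr3": 0x1000, "shadow_cr3": None})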
Host-Based Virtualization
An alternative VM architecture is to install a virtualization layer on top of the host OS. This
host OS is still responsible for managing the hardware. The guest OSes are installed and run on top of
the virtualization layer. Dedicated applications may run on the VMs. Certainly, some other applications
can also run with the host OS directly. This host-based architecture has some distinct advantages, as enumerated next. First, the user can install this VM architecture without modifying the host OS. Second, the host-based approach appeals to many host machine configurations.
Para-Virtualization
Para-virtualization needs to modify the guest operating system. A para-virtualized VM provides special APIs that require substantial OS modifications in user applications. Performance degradation is a critical issue of a virtualized system. The figure illustrates the concept of a para-virtualized VM architecture. The guest OSes are para-virtualized; they are assisted by an intelligent compiler to replace the nonvirtualizable OS instructions by hypercalls. The traditional x86 processor offers four instruction execution rings: Rings 0, 1, 2, and 3. The lower the ring number, the higher the privilege of the instruction being executed. The OS is responsible for managing the hardware, and its privileged instructions execute at Ring 0, while user-level applications run at Ring 3.
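The following sketch suggests, in toy form, how a compiler-assisted port might replace a nonvirtualizable privileged instruction with a hypercall; the hypercall_table and set_page_table names are illustrative assumptions, not a real hypervisor API.

hypercall_table = {}

def hypercall(name):
    """Register a hypervisor-side handler for a guest hypercall."""
    def register(fn):
        hypercall_table[name] = fn
        return fn
    return register

@hypercall("set_page_table")
def hv_set_page_table(vm, base):
    vm["page_table_base"] = base          # hypervisor validates and applies it

# In the para-virtualized guest kernel, the compiler-assisted port replaces
#   mov cr3, base        (a nonvirtualizable privileged instruction)
# with an explicit call into the hypervisor:
def guest_set_page_table(vm, base):
    hypercall_table["set_page_table"](vm, base)

vm = {"page_table_base": None}
guest_set_page_table(vm, 0x2000)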
Although para-virtualization reduces the overhead, it incurs other problems. First, its compatibility and portability may be in doubt, because it must support the unmodified OS as well. Second, the cost of maintaining para-virtualized OSes is high, because they may require deep OS kernel modifications. Finally, the performance advantage of para-virtualization varies greatly due to workload variations.
Memory Virtualization
Virtual memory virtualization is similar to the virtual memory support provided by modern
operating systems. In a traditional environment, the OS maintains page tables for the mapping of virtual memory to machine memory, which is a one-stage mapping. All modern x86 CPUs include a memory
management unit (MMU) and a translation lookaside buffer (TLB) to optimize virtual memory
performance. However, in a virtual execution environment, virtual memory virtualization involves
sharing the physical system memory in RAM and dynamically allocating it to the physical memory of
the VMs. A two-stage mapping process should be maintained by the guest OS and the VMM,
respectively: virtual memory to physical memory, and physical memory to machine memory. The VMM is responsible for mapping the guest physical memory to the actual machine memory.
Since each page table of the guest OSes has a separate page table in the VMM corresponding to
it, the VMM page table is called the shadow page table. VMware uses shadow page tables to perform
virtual-memory-to-machine-memory address translation. Processors use TLB hardware to map the
virtual memory directly to the machine memory to avoid the two levels of translation on every access.
When the guest OS changes the virtual memory to a physical memory mapping, the VMM updates the
shadow page tables to enable a direct lookup.
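The two-stage mapping and the shadow update can be sketched as follows; the dictionaries stand in for page tables, and all names are illustrative.

guest_page_table    = {}   # guest OS: virtual page -> guest-physical page
physical_to_machine = {}   # VMM:      guest-physical page -> machine page
shadow_page_table   = {}   # VMM:      virtual page -> machine page (direct)

def guest_map(vpage, ppage):
    """Guest OS changes a virtual-to-physical mapping; the VMM traps it."""
    guest_page_table[vpage] = ppage
    # The VMM intercepts the change and refreshes the shadow entry so the
    # hardware can translate virtual -> machine in one step:
    shadow_page_table[vpage] = physical_to_machine[ppage]

physical_to_machine[7] = 42    # VMM placed guest-physical page 7 at machine page 42
guest_map(vpage=3, ppage=7)
print(shadow_page_table[3])    # 42: one-step lookup, no two-level walk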
I/O Virtualization
It involves managing the routing of I/O requests between virtual devices and the shared physical
hardware. There are three ways to implement I/O virtualization:
Full device emulation
Para-virtualization
Direct I/O.
Full device emulation
All the functions of a device, such as device enumeration, identification, interrupts, and DMA, are replicated in software. The emulation software is located in the VMM and acts as a virtual device. The I/O access requests of the guest OS are trapped by the VMM, which interacts with the real I/O devices.
Para-virtualization
It is a split driver model consisting of a frontend driver and a backend driver. The frontend
driver is running in Domain U and the backend driver is running in Domain 0. They interact with each
other via a block of shared memory. The frontend driver manages the I/O requests of the guest OSes
and the backend driver is responsible for managing the real I/O devices and multiplexing the I/O data
of different VMs. Although para-I/O-virtualization achieves better device performance than full device
emulation, it comes with a higher CPU overhead.
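A minimal sketch of the split driver model follows, with a deque standing in for the shared-memory block between Domain U and Domain 0; the class names are illustrative.

from collections import deque

shared_ring = deque()                 # stands in for the shared-memory block

class FrontendDriver:                 # runs in Domain U
    def submit(self, request):
        shared_ring.append(request)   # hand the guest's I/O request over

class BackendDriver:                  # runs in Domain 0
    def service(self):
        while shared_ring:
            req = shared_ring.popleft()
            # multiplex requests from different VMs onto the real device
            print(f"Domain 0 issuing {req['op']} for VM {req['vm']}")

FrontendDriver().submit({"vm": 1, "op": "read block 12"})
FrontendDriver().submit({"vm": 2, "op": "write block 90"})
BackendDriver().service()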
Direct I/O virtualization
It lets the VM access devices directly. It can achieve close-to-native performance without high
CPU costs. However, current direct I/O virtualization implementations focus on networking for
mainframes.
Another way to help I/O virtualization is via self-virtualized I/O (SV-IO). The key idea is to
harness the rich resources of a multicore processor. All tasks associated with virtualizing an I/O device
are encapsulated in SV-IO. SV-IO defines one virtual interface (VIF) for every kind of virtualized I/O
device, such as virtual network interfaces, virtual block devices (disk), virtual camera devices, and
others. The guest OS interacts with the VIFs via VIF device drivers. Each VIF consists of two message
queues. One is for outgoing messages to the devices and the other is for incoming messages from the
devices. In addition, each VIF has a unique ID for identifying it in SV-IO.
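A small sketch of a VIF with its two message queues and unique ID might look like the following; the class and method names are illustrative, not the actual SV-IO interface.

from collections import deque
from itertools import count

_vif_ids = count(1)

class VIF:
    def __init__(self, kind):
        self.id = next(_vif_ids)      # unique ID identifying the VIF in SV-IO
        self.kind = kind              # e.g., "network", "block", "camera"
        self.outgoing = deque()       # messages from the guest to the device
        self.incoming = deque()       # messages from the device to the guest

    def send(self, msg):              # called by the guest's VIF device driver
        self.outgoing.append(msg)

    def receive(self):                # device-side completions flow back here
        return self.incoming.popleft() if self.incoming else None

vnic  = VIF("network")
vdisk = VIF("block")
vnic.send("tx packet")
print(vnic.id, vdisk.id)              # 1 2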
Virtualization in Multi-Core Processors
Multicore processors are claimed to have higher performance by integrating multiple processor cores in a single chip, but multicore virtualization has raised some new challenges to computer architects, compiler constructors, system designers, and application programmers. Application programs must be parallelized to use all cores fully, and software must explicitly assign tasks to the cores, which is a very complex problem. Concerning the first challenge, new programming models, languages, and libraries are needed to make parallel programming easier. The second challenge has spawned research involving scheduling algorithms and resource management policies.
Virtual Hierarchy
A virtual hierarchy is a cache hierarchy that can adapt to fit the workload or mix of workloads.
The hierarchy's first level locates data blocks close to the cores needing them for faster access, establishes a shared-cache domain, and establishes a point of coherence for faster communication. The first level can also provide isolation between independent workloads. A miss at the first level can invoke an access to the second level.
The following figure illustrates a logical view of such a virtual cluster hierarchy in two levels. Each VM operates in an isolated fashion at the first level, which minimizes both miss access time and performance interference with other workloads or VMs. The second level maintains a globally shared memory, which facilitates dynamically repartitioning resources without costly cache flushes.
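The two-level lookup can be sketched as follows, with dictionaries standing in for the caches; all names are illustrative.

global_level2 = {}                       # globally shared second level

class VMDomain:
    """Per-VM first-level cache domain (isolates the VM's working set)."""
    def __init__(self):
        self.level1 = {}

    def lookup(self, block):
        if block in self.level1:          # hit close to the owning cores
            return self.level1[block]
        value = global_level2.get(block)  # first-level miss invokes level 2
        if value is not None:
            self.level1[block] = value    # install locally for fast reuse
        return value

global_level2["blk9"] = "data"
vm_a, vm_b = VMDomain(), VMDomain()
print(vm_a.lookup("blk9"))               # miss -> level 2 -> cached in vm_a
print("blk9" in vm_b.level1)             # False: vm_b's domain stays isolated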
VIRTUAL CLUSTERS AND RESOURCE MANAGEMENT
A physical cluster is a collection of servers interconnected by a physical network such as a LAN
whereas virtual clusters have VMs that are interconnected logically by a virtual network across several
physical networks. We will study three critical design issues of virtual clusters: live migration of VMs,
memory and file migrations, and dynamic deployment of virtual clusters.
VMs in a virtual cluster have the following interesting properties:
The virtual cluster nodes can be either physical or virtual machines.
The purpose of using VMs is to consolidate multiple functionalities on the same server.
VMs can be colonized (replicated) in multiple servers for the purpose of promoting distributed parallelism, fault tolerance, and disaster recovery.
The size (number of nodes) of a virtual cluster can grow or shrink dynamically.
The failure of any physical nodes may disable some VMs installed on the failing nodes, but the failure of VMs will not pull down the host system.
A virtual cluster can be based on application partitioning or customization. The most important thing is to determine how to store those images in the system efficiently. There are common installations for
most users or applications, such as operating systems or user-level programming libraries. These
software packages can be preinstalled as templates (called template VMs). With these templates, users
can build their own software stacks. New OS instances can be copied from the template VM.
Fast Deployment and Effective Scheduling
Deployment means two things: to construct and distribute software stacks (OS, libraries,
applications) to a physical node inside clusters as fast as possible, and to quickly switch runtime
environments from one user’s virtual cluster to another user’s virtual cluster. If one user finishes using
his system, the corresponding virtual cluster should shut down or suspend quickly to save the resources
to run other VMs for other users.
High-Performance Virtual Storage
Basically, there are four steps to deploy a group of VMs onto a target cluster: preparing the disk image, configuring the VMs, choosing the destination nodes, and executing the VM deployment command on every host. Many systems use templates to simplify the disk image preparation process. A template is a disk image that includes a preinstalled operating system with or without certain application software. Templates could implement the COW (Copy on Write) format. A new COW backup file is very small and easy to create and transfer.
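The following toy sketch shows why a COW clone of a template stays small: it records only the blocks it overwrites and falls back to the shared template for the rest; the block-map model is illustrative, not a real image format.

template_image = {0: "bootloader", 1: "kernel", 2: "libs"}  # preinstalled OS

class COWClone:
    def __init__(self, base):
        self.base = base          # shared, read-only template blocks
        self.delta = {}           # private writes only: the small COW file

    def read(self, block):
        return self.delta.get(block, self.base.get(block))

    def write(self, block, data):
        self.delta[block] = data  # copy-on-write: template stays untouched

vm1 = COWClone(template_image)
vm1.write(2, "libs + user app")
print(vm1.read(1), "|", vm1.read(2))   # kernel | libs + user app
print(len(vm1.delta))                  # 1: the new backing file stays tiny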
Live VM Migration Steps
There are four ways to manage a virtual cluster. First, we can use a guest-based manager, by which the cluster manager resides on a guest system. In this case, multiple VMs form a virtual cluster. Second, we can build a cluster manager on the host systems. The host-based manager supervises the guest systems and can restart the guest system on another physical machine. The third way to manage a virtual cluster is to use an independent cluster manager on both the host and guest systems. Finally, we can use an integrated cluster manager on the guest and host systems. This means the manager must be designed to distinguish between virtualized resources and physical resources.
A VM can be in one of the following four states.
An inactive state is defined by the virtualization platform, under which the VM is not enabled.
An active state refers to a VM that has been instantiated at the virtualization platform to perform a real
task.
A paused state corresponds to a VM that has been instantiated but disabled to process a task or paused in
a waiting state.
A VM enters the suspended state if its machine file and virtual resources are stored back to the disk.
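These states and a few of their legal transitions can be sketched as a simple state machine; the enum and transition table below are illustrative assumptions, not a platform-defined API.

from enum import Enum, auto

class VMState(Enum):
    INACTIVE  = auto()   # defined by the platform; the VM is not enabled
    ACTIVE    = auto()   # instantiated and performing a real task
    PAUSED    = auto()   # instantiated but not processing (waiting)
    SUSPENDED = auto()   # machine file and virtual resources stored to disk

TRANSITIONS = {
    VMState.INACTIVE:  {VMState.ACTIVE},
    VMState.ACTIVE:    {VMState.PAUSED, VMState.SUSPENDED, VMState.INACTIVE},
    VMState.PAUSED:    {VMState.ACTIVE},
    VMState.SUSPENDED: {VMState.ACTIVE},
}

def transition(cur, nxt):
    if nxt not in TRANSITIONS[cur]:
        raise ValueError(f"illegal transition {cur} -> {nxt}")
    return nxt

state = transition(VMState.INACTIVE, VMState.ACTIVE)
state = transition(state, VMState.SUSPENDED)   # stored back to disk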
Live migration of a VM from one machine to another consists of the following six steps:
Steps 0 and 1: Start migration. This step makes preparations for the migration, including determining the migrating VM and the destination host.
Step 2: Transfer memory. Since the whole execution state of the VM is stored in memory, sending the VM's memory to the destination node ensures continuity of the service provided by the VM. All of the memory data is transferred in the first round; later rounds copy only the memory dirtied during the previous round.
Step 3: Suspend the VM and copy the last portion of the data. The migrating VM’s execution is suspended when
the last round’s memory data is transferred.
Steps 4 and 5: Commit and activate the new host. After all the needed data is copied, on the destination host, the
VM reloads the states and recovers the execution of programs in it, and the service provided by this VM
continues.
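Steps 0 through 5 amount to the classic pre-copy loop, sketched below with toy stand-ins for the VM and the transfer target; nothing here is a real migration API.

class ToyVM:
    def __init__(self, pages):
        self.pages = dict(pages)     # page number -> contents
        self.dirty = set(pages)      # initially everything must be sent
        self.running = True

    def touch(self, n, data):        # the guest keeps writing while migrating
        self.pages[n] = data
        self.dirty.add(n)

    def take_dirty(self):
        d = {n: self.pages[n] for n in self.dirty}
        self.dirty.clear()
        return d

def live_migrate(vm, dest_memory, max_rounds=5, stop_threshold=1):
    # Steps 0-1: preparation (choosing VM and destination) happens before this.
    dest_memory.update(vm.take_dirty())       # round 1: copy all memory
    for _ in range(max_rounds):
        if len(vm.dirty) <= stop_threshold:   # small enough to stop-and-copy
            break
        dest_memory.update(vm.take_dirty())   # recopy pages dirtied meanwhile
    vm.running = False                        # step 3: suspend the VM ...
    dest_memory.update(vm.take_dirty())       # ... and copy the last portion
    return True                               # steps 4-5: commit and activate

vm = ToyVM({0: "os", 1: "app", 2: "data"})
dest = {}
vm.touch(2, "data v2")                        # a write before the final round
live_migrate(vm, dest)
print(dest[2])                                # data v2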
In data centers, a large number of heterogeneous workloads can run on servers at various times.
These heterogeneous workloads can be roughly divided into two categories: chatty workloads and
noninteractive workloads. Chatty workloads may burst at some point and return to a silent state at some
other point. A web video service is an example of this, whereby a lot of people use it at night and few
people use it during the day. Noninteractive workloads do not require people’s efforts to make progress
after they are submitted. Server consolidation is an approach to improve the low utilization of hardware resources by reducing the number of physical servers. However, the use of VMs increases resource management complexity. This causes a challenge in terms of how to improve resource utilization as well as guarantee QoS in data centers.
Consolidation enhances hardware utilization. Many underutilized servers are consolidated into
fewer servers to enhance resource utilization. Consolidation also facilitates backup services and
disaster recovery.
This approach enables more agile provisioning and deployment of resources. In a virtual
environment, the images of the guest OSes and their applications are readily cloned and reused.
The total cost of ownership is reduced. In this sense, server virtualization causes deferred
purchases of new servers, a smaller data-center footprint, lower maintenance costs, and lower
power, cooling, and cabling requirements.
This approach improves availability and business continuity. The crash of a guest OS has no
effect on the host OS or any other guest OS.
To automate data-center operations, one must consider resource scheduling, architectural support,
power management, automatic or autonomic resource management, performance of analytical models,
and so on. In virtualized data centers, an efficient, on-demand, fine-grained scheduler is one of the key
factors to improve resource utilization. Scheduling and reallocations can be done in a wide range of
levels in a set of data centers. Dynamic CPU allocation is based on VM utilization and application-level
QoS metrics. One method considers both CPU and memory flowing as well as automatically adjusting
resource overhead based on varying workloads in hosted services. Another scheme uses a two-level
resource management system to handle the complexity involved. A local controller at the VM level and
a global controller at the server level are designed.
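The two-level scheme can be sketched as follows; the local and global controller policies shown are illustrative assumptions, not the actual published algorithms.

def local_controller(vm_utilization, current_share):
    """VM-level: ask for more CPU when busy, release it when idle."""
    if vm_utilization > 0.9:
        return current_share * 1.2
    if vm_utilization < 0.3:
        return current_share * 0.8
    return current_share

def global_controller(requests, capacity=1.0):
    """Server-level: scale requests down proportionally if oversubscribed."""
    total = sum(requests.values())
    scale = min(1.0, capacity / total) if total else 1.0
    return {vm: share * scale for vm, share in requests.items()}

requests = {
    "vm1": local_controller(0.95, 0.4),   # busy VM asks for more
    "vm2": local_controller(0.10, 0.4),   # idle VM gives some back
    "vm3": local_controller(0.50, 0.4),
}
print(global_controller(requests))        # shares fit within server capacity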
Virtual Storage Management
In system virtualization, virtual storage includes the storage managed by VMMs and guest
OSes. Generally, the data stored in this environment can be classified into two categories: VM images
and application data. The VM images are special to the virtual environment, while application data
includes all other data and is the same as the data in traditional OS environments. In data centers, there are often thousands of VMs, which causes the storage system to be flooded with VM images. Research is ongoing in this field to make management easy while enhancing performance and reducing the amount of storage occupied by the VM images. Parallax is a distributed storage system customized for virtualization environments. Content Addressable Storage (CAS) is a solution to reduce the total size of VM images, and therefore supports a large set of VM-based systems in data centers.
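A toy sketch of the CAS idea follows: blocks are keyed by their content hash, so identical blocks across many VM images are stored only once (the block size and data structures are illustrative).

import hashlib

block_store = {}                      # content hash -> block data (shared)

def store_image(blocks):
    """Return an image recipe: the list of hashes that reconstructs it."""
    recipe = []
    for block in blocks:
        h = hashlib.sha256(block).hexdigest()
        block_store.setdefault(h, block)   # duplicate blocks stored once
        recipe.append(h)
    return recipe

img1 = store_image([b"kernel", b"libs", b"app-A"])
img2 = store_image([b"kernel", b"libs", b"app-B"])  # shares two blocks
print(len(block_store))               # 4 unique blocks, not 6
restored = b"".join(block_store[h] for h in img2)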
Parallax designs a novel architecture in which storage features that have traditionally been implemented directly on high-end storage arrays and switches are relocated into a federation of storage VMs. These storage VMs share the same physical hosts as the VMs that they serve. The figure provides an overview of the Parallax system architecture. It supports all popular system virtualization techniques, such as paravirtualization and full virtualization. For each physical machine, Parallax customizes a special storage appliance VM. The storage appliance VM acts as a block virtualization layer between individual VMs and the physical storage device. It provides a virtual disk for each VM on the same physical machine.
VM-Based Intrusion Detection
The VM-based IDS contains a policy engine and a policy module. The policy framework can monitor events in different guest VMs through an operating system interface library, and PTrace is used to trace actions against the security policy of the monitored host. It is difficult to predict and prevent all intrusions without delay. At
the time of this writing, most computer systems use logs to analyze attack actions, but it is hard to
ensure the credibility and integrity of a log. The IDS log service is based on the operating system
kernel. Thus, when an operating system is invaded by attackers, the log service should be unaffected.
Besides IDS, honeynets are also prevalent in intrusion detection. They attract and provide a fake
system view to attackers in order to protect the real system. In addition, the attack action can be
analyzed, and a secure IDS can be built. A honeypot is a purposely defective system that simulates an
operating system to cheat and monitor the actions of an attacker. A honeypot can be divided into
physical and virtual forms. A guest operating system and the applications running on it constitute a
VM. The host operating system and VMM must be guaranteed to prevent attacks from the VM in a
virtual honeypot.